ABSTRACT

The  National  Agricultural  Statistics  Service  (NASS)  plans  to  use  estimation  strategies  of  increasing 
complexity  in  the  future  and  will  need  to  estimate  the  variances  resulting  from  those  strategies. 
This  report  describes  a  relatively  simple  method  of  variance/mean  squared  error  estimation,  the 
delete-a-group  jackknife,  that  can  be  used  meaningfully  in  a  remarkably  broad  range  of  settings 
employing  complex  estimation  strategies.  The  text  describes  a  number  of  applications  of  the 
method in abstract terms. It goes on to show how the delete-a-group jackknife has been applied
to some recent NASS surveys.

SUMMARY


Historically,  NASS  has  employed  mostly  expansion  estimators  and  ratios  of  expansion  estimators 
based  on  stratified,  simple  random  samples  when  computing  indications  of  agricultural  activity. 
This  is  changing  due  in  no  small  part  to  increasing  demands  on  the  agency  to  make  more  efficient 
use  of  the  information  it  collects.  Fortunately,  parallel  increases  in  computing  power  are 
allowing  NASS  to  use  more  sophisticated  estimation  strategies  involving  multi-phase  sampling 
designs  and  calibration  estimators.  For  example,  the  1996  Agricultural  Resource  Management 
Study  (ARMS)  used  a  multi-phase  sampling  design  (a  first-phase  sample  is  randomly  drawn,  then 
a  second-phase  subsample  is  randomly  drawn  from  the  first-phase  sample,  and  so  forth).  Ratio 
adjustments  of  the  initial  inverse-probability  sampling  weights  capture  relevant  information  from 
the  ARMS  screening  phase  about  farms  not  selected  for  a  particular  survey  module. 

This  report  shows  how  different  variations  of  a  delete-a-group  jackknife  can  be  used  to  estimate 
the  variances  (or,  more  precisely,  the  mean  squared  errors)  of  a  variety  of  estimation  strategies. 
Many of these strategies are currently in use by NASS.

The  delete-a-group  jackknife  is  simple  to  use  once  appropriate  replicate  weights  are  constructed. 
By  contrast,  the  “linearization”  methods  traditionally  used  by  NASS  for  estimating  variances  can 
be exceedingly complicated and cumbersome when applied to complex estimation strategies.
The  advantage  of  the  delete-a-group  jackknife  over  the  traditional,  delete-one-primary-sampling- 
unit-at-a-time  jackknife  (see  Rust  1985)  is  that  the  number  of  needed  replicate  weights  per  sample 
record  is  kept  manageable. 

A disadvantage of the delete-a-group jackknife relative to the delete-one jackknife is that it requires
the first-phase stratum sample sizes to be large - at least five sample units per stratum.
Otherwise,  the  delete-a-group  jackknife  will  be  overly  conservative;  that  is,  higher,  on  average, 
than  the  true  variance  it  is  measuring.  As  a  result,  when  this  jackknife  is  applied  to  estimators 
from  the  NASS  area  frame,  it  will  be  biased  upward. 

Like  the  delete-one  jackknife,  the  delete-a-group  jackknife  is  a  nearly  unbiased  estimator  of 
variance  only  when  the  first-phase  sampling  fractions  are  small  -  no  more  than  1/5  for  most 
records.  Otherwise,  the  delete-a-group  jackknife  tends  to  be  biased  upward.  This  bias  is  likely 
to  be  ignorable  in  most  NASS  applications.  For  the  1996  VCUS,  however,  it  was  so  great  that 
the delete-a-group jackknife had to be modified. A potential modification is discussed in the text.
It  is  useful,  but  has  a  striking  limitation:  One  set  of  replicate  weights  is  needed  when  estimating 
the  variances  of  totals  and  another  when  estimating  the  variances  of  ratios. 


INTRODUCTION 

This  report  addresses  the  construction  of 
delete-a-group  jackknife  variance  estimators 
for  a  variety  of  estimation  strategies  (an 
estimation  strategy  is  a  sampling  design 
paired  with  an  estimator).  The  emphasis 
will  be  on  computational  formulae,  which 
will  be  rendered  in  fairly  abstract  form. 
Relevant  theoretical  comments  will  be  made 
where  appropriate,  but  most  proofs  are  left 
for  the  appendices. 

The  sampling  designs  with  which  we  will  be 
dealing  may  have  any  number  of  phases.  At 
each  phase,  one  of  the  following  selection 
schemes  is  assumed  to  be  used: 

1)  stratified  simple  random  sampling  without 
replacement, 

2)  systematic  probability  sampling  (usually 
called  systematic  probability  proportional  to 
size  sampling;  here  we  want  to  de-emphasize 
the  “size”  measure), 

3)  the  converse  of  systematic  probability 
sampling  (what  remains  in  a  frame  after  a 
systematic  probability  sample  has  been 
removed),  or 

4)  Poisson  sampling  (in  which  each  element 
is  given  its  own  selection  probability,  and 
the  sampling  of  one  element  has  no  impact 
on  whether  another  gets  selected). 

All  stratum  samples  are  assumed  to  be  large 
(contain  at  least  five  sampling  units). 
Violation of this assumption in the first
phase of sampling can cause the delete-a-group
jackknife to be biased upward. This
is  shown  in  Appendix  A. 


NASS  currently  incorporates  two  types  of 
calibration  in  its  estimators  and  does  not 
plan  to  use  any  other  types  in  the  near 
future.  “Calibration”  is  a  general  term  for 
a  sampling-weight  adjustment  that  forces  the 
estimates  of  certain  item  totals  based  on  the 
sample  at  one  phase  of  sampling  to  equal  the 
same  totals  based  on  a  previous  phase  or 
frame  (control)  data. 

Ratio  adjustments,  the  most  common  form  of 
calibration,  were  used  repeatedly  in  the  1996 
Agricultural  Resource  Management  Study 
(ARMS). Restricted regression, another
popular form of calibration, was used in
both the 1997 Minnesota pilot Quarterly
Agriculture Survey (QAS) and the second
phase of the 1996 Vegetable Chemical Use
Survey (VCUS). Only these forms of
calibration  are  discussed  in  the  text. 

Most  of  the  results  in  this  report  are 
supported  with  randomization-based  (design- 
based)  analyses.  As  a  consequence,  all 
estimators  of  population  parameters  are 
assumed  to  be  randomization  consistent  (i.e., 
have  small  randomization  mean  squared 
errors  and  even  smaller  randomization 
biases).  A  brief  discussion  of  the  model- 
based  properties  of  the  delete-a-group 
jackknife  is  reserved  for  a  separate  section. 

The  concise  term  “variance  estimation”  will 
be  used  throughout  the  text  in  place  of  the 
more  cumbersome  “mean  squared  error 
estimation.” It should be understood,
however, that when the delete-a-group jackknife
is  a  good  estimator  for  the  variance  of  a 
randomization-consistent  estimator,  it  is 
also  a  good  estimator  for  its  mean  squared 
error. 

For our purposes, the term “nearly
unbiased” will mean that the bias of the
estimator  in  question  is  an  ignorably  small 
fraction  of  its  mean  squared  error.  The  term 
“biased”  will  be  used  to  mean  “not 
(necessarily)  nearly  unbiased.” 

When  first-phase  stratum  sample  sizes  are 
large,  the  delete-a-group  jackknife  is 
appropriate  (has  only  a  small  potential  for 
bias)  whenever  the  conventional, 
randomization-based,  delete-one  jackknife  is. 
Kott  and  Stukel  (1997)  have  extended  the 
use  of  the  latter  jackknife  to  two-phase 
estimators  with  calibration  in  the  second 
phase.  This  report  relies  heavily  on  their 
results. Here, however, systematic
probability sampling can be used in the second design
phase,  even  though  Kott  and  Stukel  only 
treated  strategies  featuring  stratified  simple 
random  sampling  in  the  second  phase. 

In  most  applications  of  the  delete-a-group 
jackknife  at  NASS  the  need  for  finite 
population  correction  (fpc)  is  ignored.  One 
section  of  the  text  discusses  a  number  of 
those  applications. 

A  subsequent  section  takes  up  variance 
estimation  of  totals  and  ratios  when  proper 
fpc  is  a  concern.  Strictly  speaking,  the 
variant  of  the  delete-a-group  jackknife  that 
captures  fpc  requires  single-phase  Poisson 
sampling  to  be  nearly  unbiased. 
Nevertheless,  the  practical  application  can  be 
broader,  as  we  shall  see. 

WHY  USE  THE  DELETE-A-GROUP 
JACKKNIFE? 

The  delete-a-group  jackknife  is  simple  to 
compute  once  appropriate  replicate  weights 
are constructed. The so-called “linearization”
methods traditionally used by NASS
for estimating variances can be very
cumbersome  when  applied  to  estimators 
based  on  multi-phase  designs  like  the  1996 
Vegetable  Chemical  Use  Survey  (VCUS) 
(Hicks  1998)  and  components  of  the  1996 
ARMS  (Kott  and  Fetter  1997).  Estimators 
using  calibrated  weights  based  on  restricted 
regression,  like  those  calculated  for  the  1997 
Minnesota  pilot  Quarterly  Agriculture 
Survey  (QAS),  pose  even  greater  practical 
problems  for  linearization  variance  methods 
(a  multivariate  regression  coefficient  would 
need  to  be  estimated  for  every  item  of 
interest). 

It  is  also  a  relatively  simple  matter  to  apply 
the  delete-a-group  jackknife  to  the 
composite  estimators  associated  with  the 
ARMS.  With  1996  survey  data,  for 
example,  results  from  the  Phase  II  Corn 
Production  Practices  Report  (PPR)  were 
composited  with  results  from  the  Phase  II 
Corn-for-Grain  Production  Practices  and 
Costs  Report  (PPCR).  In  addition,  results 
from  the  Phase  III  Cost  and  Returns  Report 
(CRR)  stand-alone  (based  on  respondents 
that  were  not  in  the  Phase  II  PPCR  sample) 
were  composited  with  results  from  the  Phase 
III  CRR  follow-on  (based  on  respondents 
that  were). 

The  advantage  of  the  delete-a-group 
jackknife  over  the  traditional,  delete-one- 
primary-sampling-unit-at-a-time  jackknife 
(see  Rust  1985)  is  that  the  number  of  needed 
replicate  weights  per  sample  record  is  kept 
manageable.  A  common  practice  with  the 
delete-one  jackknife  for  handling  this 
problem  is  to  group  primary  sampling  units 
(PSU’s)  into  variance  PSU’s.  This  practice 
reduces  the  number  of  replicate  weights 
needed  per  record  -  there  is  one  for  every 
variance PSU. Nevertheless, NASS would
need at least 15 replicate weights per record
to  compute  variances  for  state  estimators. 
This  would  result  in  national  variance 
estimates  employing  several  hundred 
replicate  weights  per  record. 

COMPUTING  A  DELETE-A-GROUP 
JACKKNIFE:  AN  OVERVIEW 

Suppose  we  have  a  sampling  design  with  any 
number  of  phases  and  a  randomization- 
consistent  estimator,  t,  we  wish  to  apply  to 
the  resultant  sample.  To  compute  a  delete-a- 
group  jackknife  variance  estimator  for  t,  we 
first  divide  the  first-phase  sample  -  both 
respondents  and  non-respondents  -  into  R 
(jackknife)  groups.  Currently,  R  is  15  in 
NASS  applications.  Consequently,  we  will 
assume  R  =  15  in  the  text.  By  setting  R  at 
15, we lengthen the traditional, normality-based
95% confidence interval by about ten
percent. To see why this is so, observe that
the ratio of the t-value at 0.975 for a
Student’s t distribution with 14 degrees of
freedom and the normal z-value at 0.975 is
approximately 1.1.
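
As a quick check of that ratio, using standard tabulated quantiles (the subscripted notation below is introduced here for illustration and is not from the report):

```latex
\frac{t_{14,\,0.975}}{z_{0.975}} \approx \frac{2.145}{1.960} \approx 1.09
```

So a 95% interval built from a 14-degree-of-freedom jackknife is roughly nine to ten percent wider than its normal-theory counterpart.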

Suppose  we  have  a  survey  which  may  have 
multiple  phases.  Let  F  be  the  sample 
selected  at  the  first  phase  of  the  sampling 
process. The first-phase sample units may
be distinct elements (e.g.,
farms) or they may be clusters of
elements (e.g., area segments). Many
survey  designs  feature  a  single  phase  of 
sample  selection. 

The  delete-a-group  jackknife  begins  by 
dividing the first-phase sample, F, into 15
groups. This can be done as follows: order
F in an appropriate manner (discussed
below); select the first, sixteenth, thirty-first,
... units for the first group; select the
second, seventeenth, thirty-second, ... units
for the second group; continue until all 15
groups are created. Unless the number of
units  in  F  is  divisible  by  15  (which  is 
unlikely),  the  groups  will  not  all  be  of  the 
same  size. 

Ordering  in  an  “appropriate  manner” 
depends  on  the  context.  If  F  was  drawn 
using  stratified  random  sampling,  then  order 
the  sample  so  that  units  in  the  same  stratum 
are  listed  together  (i.e.,  contiguously).  If 
samples  were  drawn  using  Poisson  sampling, 
order  the  sample  units  randomly. 
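
The grouping rule can be sketched in a few lines of Python. This is a minimal illustration rather than NASS production code: the function name, the stratum labels, and the 40-unit sample are all hypothetical. It covers the stratified case, where units are first sorted so that each stratum is contiguous; for a Poisson sample one would shuffle the units instead of sorting them.

```python
from collections import Counter

def assign_jackknife_groups(strata, num_groups=15):
    """Return a 1-based jackknife group label for each first-phase unit.

    strata[i] is the stratum label of unit i. Units are sorted so that
    members of a stratum are contiguous, then dealt out in turn: the 1st,
    16th, 31st, ... sorted units go to group 1, and so on.
    """
    # sorted() is stable, so the incoming order is kept within each stratum.
    order = sorted(range(len(strata)), key=lambda i: strata[i])
    groups = [0] * len(strata)
    for position, unit in enumerate(order):
        groups[unit] = position % num_groups + 1
    return groups

# Hypothetical sample: 40 units from three strata, listed in mixed order.
strata = ["A", "B", "C", "A"] * 10
groups = assign_jackknife_groups(strata)
sizes = Counter(groups)
print(sorted(sizes.items()))
```

Because 40 is not divisible by 15, the group sizes differ by one unit, as the text anticipates.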

Let S denote the final respondent sample
used to compute t, and let $w_i$ denote the
sampling weight for element i in S. The
elements in S may be the same as the sample
units in F or they may be a subsample of
those units. The elements in S may also
have a different nature than the original
sample units in F; for example, they may be
farms as opposed to area segments or fields
as opposed to farms. In all such cases,
however, each element in S must be
contained within an original sample unit in F
in a clearly defined way. Let $e_i$ be the
original sampling weight of the unit
containing i (which may be i itself); that is,
$e_i$ is the inverse of the unit’s first-phase
probability of selection.

Let $S_r$ denote that part of the final sample
originating in first-phase sample units
assigned to group r. The jackknife replicate
$S_{(r)}$ is the whole final sample S with $S_r$
removed. We similarly define $F_{(r)}$ as the set
of first-phase sample units not in group r.

We need to create 15 sets of replicate
weights $\{w_{i(r)}\}$, one for each r, in the
following manner: $w_{i(r)} = 0$ for all elements
in $S_r$; for other elements, $w_{i(r)}$ will be close
to $(15/14) w_i$ but adjusted to satisfy
calibration constraints similar to those
satisfied by $w_i$ (exactly how to do this in a
number of situations is the subject matter of
the following section). Observe that a
$w_{i(r)}$-value has been assigned to every
element in S, including those in $S_r$.

Now t is an estimate based on the sample S
calculated using the set of weights $\{w_i\}$.
Let $t_{(r)}$ be the same estimate but with the
members of $\{w_{i(r)}\}$ replacing those of $\{w_i\}$. The
delete-a-group jackknife variance estimator
for t is

$$v_J = (14/15) \sum_{r=1}^{15} (t_{(r)} - t)^2. \tag{1}$$
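
Equation (1) translates directly into code once the full-sample estimate and the replicate estimates are in hand. The sketch below is illustrative only; the function name and the numbers are hypothetical, and the factor is written as (R - 1)/R, which is an assumption made here so that it reduces to 14/15 when R = 15.

```python
def delete_a_group_variance(full_estimate, replicate_estimates):
    """Equation (1): ((R - 1) / R) * sum over r of (t_(r) - t)^2."""
    num_groups = len(replicate_estimates)
    scale = (num_groups - 1) / num_groups
    return scale * sum((t_r - full_estimate) ** 2 for t_r in replicate_estimates)

# Hypothetical full-sample estimate t and 15 replicate estimates t_(r).
t = 100.0
t_reps = [t + d for d in (-2.0, -1.0, 0.0, 1.0, 2.0)] * 3
v_j = delete_a_group_variance(t, t_reps)
print(v_j)
```

Each replicate estimate would in practice be computed by rerunning the estimator with the r'th set of replicate weights in place of the original weights.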

PARTICULAR  CASES  (IGNORING 
FINITE  POPULATION  CORRECTION) 

In  this  section,  we  see  how  the  delete-a- 
group  jackknife  can  be  fruitfully  applied  in 
a  number  of  estimation  strategies  where  fpc 
may  be  ignored;  that  is,  when  the  first-phase 
selection  probabilities  are  all  small  (say  less 
than  or  equal  to  1/5). 

One  sampling  design  not  discussed  in  detail 
subsequently is stratified multi-stage
sampling,  in  which  subsampling  within  each 
primary  (first-stage)  sampling  unit  is 
conducted  independently  of  subsampling  in 
other  primary  sampling  units.  When  the 
first  stage  of  sampling  has  ignorably  small 
selection  probabilities,  the  conventional 
variance  estimator  for  a  stratified  multi-stage 
sample  looks  exactly  like  that  for  a  stratified 
single-stage  cluster  sample  with  estimated 
totals  for  primary  sampling  units  used  in 
place  of  actual  values.  As  a  result,  when  a 
delete-a-group  jackknife  is  appropriate  for  an 
estimator based on a stratified single-stage
sample, it is appropriate for an estimator
based  on  a  stratified  multi-stage  sample. 

Stratified  Simple  Random  Sampling 
Suppose we have a single-phase stratified
simple random sample without any
nonresponse (handling nonresponse will be
discussed later). The original and final
sampling weight for a unit i in stratum h is
$e_i = w_i = N_h/n_h$, where $N_h$ is the population
size of stratum h and $n_h$ is its sample size.

Let us now consider the r’th set of replicate
weights. For a unit i in $S_{(r)}$ and stratum h,
$e_{i(r)} = (15/14) N_h/n_h$. By contrast, the appropriate
final r’th replicate weight for unit i
recognizes the calibration equations inherent
in the direct expansion estimator (i.e.,
$N_h = \sum_{j \in S_{(r)} \cap h} w_{j(r)}$ for all h). It is
$w_{i(r)} = N_h/n_{h(r)} = (n_h/n_{h(r)}) e_i$, where $n_{h(r)}$ is the number of
sample units in both $S_{(r)}$ and h. Observe that
$e_{i(r)} = w_{i(r)}$ only when $n_h$ is divisible by 15.
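
A minimal sketch of these replicate weights follows. The function name, argument layout, and the two-stratum example are hypothetical; the rule itself is the one stated above, $w_{i(r)} = N_h/n_{h(r)}$ for units outside group r and 0 for units in it.

```python
from collections import Counter

def srs_replicate_weights(strata, groups, stratum_sizes, num_groups=15):
    """Return one list of replicate weights per jackknife group r = 1..R.

    strata[i] is the stratum of unit i, groups[i] its jackknife group,
    and stratum_sizes[h] the population size N_h.
    """
    replicates = []
    for r in range(1, num_groups + 1):
        # n_h(r): sample units of stratum h remaining once group r is dropped.
        remaining = Counter(h for h, g in zip(strata, groups) if g != r)
        replicates.append([
            0.0 if g == r else stratum_sizes[h] / remaining[h]
            for h, g in zip(strata, groups)
        ])
    return replicates

# Hypothetical sample: 20 units in stratum A, 10 in stratum B.
strata = ["A"] * 20 + ["B"] * 10
groups = [i % 15 + 1 for i in range(30)]
stratum_sizes = {"A": 400, "B": 100}
replicates = srs_replicate_weights(strata, groups, stratum_sizes)
```

Within every replicate the weights in stratum h again sum to $N_h$, which is the calibration equation the text refers to.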

Stratified  Systematic  Probability  Sampling 
Suppose we have a single-phase, stratified
systematic probability sample. The original
and final sampling weight for a unit i in
stratum h is $e_i = w_i = M_h/(n_h m_i)$, where $m_i$
is the measure of size of unit i in stratum h,
$M_h$ is the sum of the $m_j$ across all units in
stratum h, and $n_h$ is the stratum sample size.

Analogous to the simple random sampling
case, the appropriate final r’th replicate
weight for element i recognizes the
calibration equations inherent in the Horvitz-Thompson
expansion estimator (i.e.,
$M_h = \sum_{j \in S_{(r)} \cap h} w_{j(r)} m_j$ for all h). It is
$w_{i(r)} = (n_h/n_{h(r)}) e_i$, where $n_{h(r)}$ is the number
of sample units in both $S_{(r)}$ and h.
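
To see that this replicate weight does satisfy the stated calibration equation, substitute $e_j = M_h/(n_h m_j)$ into the weighted sum of the size measures (a short verification added here, using only the definitions above):

```latex
\sum_{j \in S_{(r)} \cap h} w_{j(r)}\, m_j
  = \sum_{j \in S_{(r)} \cap h} \frac{n_h}{n_{h(r)}} \cdot \frac{M_h}{n_h m_j}\, m_j
  = \frac{M_h}{n_{h(r)}} \sum_{j \in S_{(r)} \cap h} 1
  = M_h .
```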

Stratified simple random sampling can be
viewed as equivalent to a special case of
systematic probability sampling from
randomly-ordered lists (one in which $m_i$ is
constant within strata). Appendix A
provides some theoretical justification for
using the delete-a-group jackknife as
described above with a stratified, single-phase
systematic probability sampling design
under certain conditions. One of those
conditions is that the systematic samples be
drawn from randomly-ordered lists.
Variance estimation can be problematic
when systematic samples are drawn from
purposefully-ordered lists.

Purposefully-ordered  lists  can  reduce  the 
variance  in  estimators  based  on  systematic 
samples.  Unfortunately,  the  reduction  in 
variance  due  to  a  well-designed  ordering 
usually cannot be measured in an effective
manner. 

Restricted  Regression 
(A  Form  of  Calibration) 

There  are  many  versions  of  restricted 
regression.  Below  is  a  description  of  a 
method  similar  to  what  was  used  in  the  1996 
VCUS  and  1997  Minnesota  pilot  QAS.  The 
version  presented  here  will  likely  be  used  in 
the  future. 

Suppose,  for  exposition  purposes,  there  are 
two  sampling  phases.  Suppose  further  that 
the  second  phase  sample  is  calibrated  to  a 
row vector of totals, $\eta$, based on estimates
from  the  first-phase  sample  or  determined 
from  the  frame  itself. 

Let $f_j$ be the weight for element j after the
first phase of sampling, and let $p_j$ be the
element’s selection probability in the second
sampling phase. In the absence of nonresponse
(again, nonresponse will be dealt
with later) in the second sampling phase, a
general form of the calibrated weight for j
under restricted regression is

$$w_j = f_j/p_j + \Big(\eta^* - \sum_{i \in S^*} [f_i/p_i]\, x_i\Big) \Big(\sum_{i \in S^*} [f_i/p_i]\, x_i' x_i\Big)^{-1} [f_j/p_j]\, x_j' \tag{2}$$

for $j \in S^*$, and a predetermined value
otherwise (chosen so that $w_j$ is not too small
or too far from $f_j/p_j$), where S is the second-phase
sample, $S^*$ a subset containing almost
all the elements of S, $x_i$ is a row vector of
covariates whose sum across all elements in
the population is either $\eta$ or has been
previously estimated to be $\eta$ (that is,
$\eta = \sum_{i \in F} f_i x_i$, where F denotes the elements in
the first-phase sample); finally,

$$\eta^* = \eta - \sum_{i \in S - S^*} w_i x_i.$$

Let $f_{j(r)}$ be the r’th jackknife replicate weight
for unit j after the first sampling phase.
The r’th jackknife replicate weight for
element j is 0 when $j \in S_r$; otherwise, it is

$$w_{j(r)} = w_j [f_{j(r)}/f_j] + \Big(\eta_{(r)} - \sum_{i \in S_{(r)}} w_i [f_{i(r)}/f_i]\, x_i\Big) \Big(\sum_{i \in S_{(r)}} w_i [f_{i(r)}/f_i]\, x_i' x_i\Big)^{-1} w_j [f_{j(r)}/f_j]\, x_j', \tag{3}$$

where $\eta_{(r)} = \eta$ when $\eta$ has been determined
from the frame; $\eta_{(r)} = \sum_{i \in F} f_{i(r)} x_i$ when $\eta$ has been
estimated from the first-phase sample.

Equation  (3)  is  not  the  standard  way  to 
construct  jackknife  replicate  weights.  The 
expression $w_k[f_{k(r)}/f_k]$ has been used in place
of the more common $f_{k(r)}/p_k$, to which it is
nearly equal (because $w_k \approx f_k/p_k$). Equation
(3)’s strength is that it forces the replicate
weights (for elements not in group r) to be
fairly close to the associated calibrated
weights. This appears to reduce the upward
bias that unexpected differences between the
two can cause. It should be noted that any
such  upward  bias  is  small;  in  fact,  it  is 
asymptotically  ignorable.  We  live, 
however,  in  a  finite  world. 

Restricted regression as described above can
be done at any phase of sampling. At the
t’th phase, $f_i$ in equation (2) becomes the
weight for element i at the (t-1)’th phase and
$p_i$ the element’s conditional selection
probability at the t’th phase. For a single-phase
restricted-regression estimator, we can
set all $p_i = 1$ in equation (2).

When  the  phase  of  sampling  calibrated  in 
this  manner  contains  more  than  a  single 
stratum,  the  jackknife  can  have  an  upward 
bias (see Appendix B). In addition, for a
single-phase Poisson sample, $x_i \lambda = 1$ must
hold for some $\lambda$ (see the section on Poisson
sampling and Appendix D).
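
The algebra in equations (2) and (3) is easier to follow with a small numerical sketch. The one below uses NumPy and makes two simplifying assumptions that are not in the report: $S^* = S$ (no elements are given predetermined weights), and the control totals come from the frame, so $\eta_{(r)} = \eta$. The function name and the toy data are hypothetical.

```python
import numpy as np

def restricted_regression_weights(base, X, eta):
    """Calibrate base weights to the control totals eta, as in equation (2).

    base: shape (n,), the starting weights (f_j / p_j in equation (2)).
    X:    shape (n, k), one covariate row x_j per element.
    eta:  shape (k,), the control totals.
    Elements with base weight 0 keep weight 0 and drop out of every sum.
    """
    base = np.asarray(base, dtype=float)
    X = np.asarray(X, dtype=float)
    gap = np.asarray(eta, dtype=float) - base @ X    # eta - sum_i a_i x_i
    gram = X.T @ (base[:, None] * X)                 # sum_i a_i x_i' x_i
    return base * (1.0 + X @ np.linalg.solve(gram, gap))

# Hypothetical second-phase sample of six elements, x_j = (1, size_j).
base = np.full(6, 10.0)
X = np.column_stack([np.ones(6), np.arange(1.0, 7.0)])
eta = np.array([66.0, 240.0])
w = restricted_regression_weights(base, X, eta)

# Replicate weights in the spirit of equation (3): start from w_j * f_j(r)/f_j,
# with the ratio set to 0 for the element in the deleted group.
ratio = np.array([0.0] + [15.0 / 14.0] * 5)
w_r = restricted_regression_weights(w * ratio, X, eta)
```

Reusing one function for both steps works because equation (3) has the same form as equation (2) with $w_j[f_{j(r)}/f_j]$ playing the role of the starting weight.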

Ratio-Adjusted  Weights 
(Another Form of Calibration)

Consider, again, a two-phase sample with $f_j$
and $p_j$ as above. A very common form of
calibration occurs when a vector of
covariates for element i, $x_i$, is defined in
such a way that only one component of the
vector is non-zero for each i. That is to say,
the elements are categorized into G mutually
exclusive calibration (or ratio-adjustment)
groups, and $x_{ig} > 0$ only when element i is
in group g; otherwise, $x_{ig} = 0$.

Under that structure, a ratio-adjusted weight
for an element j in group g is

$$w_j = \eta_g \Big(\sum_{i \in S} [f_i/p_i]\, x_{ig}\Big)^{-1} f_j/p_j, \tag{4}$$

and $\eta = (\eta_1, ..., \eta_G)$. Similarly, the
corresponding replicate weight is 0 for
$j \in S_r$, and

$$w_{j(r)} = \eta_{g(r)} \Big(\sum_{i \in S_{(r)}} [f_{i(r)}/p_i]\, x_{ig}\Big)^{-1} [f_{j(r)}/p_j] \tag{5}$$

otherwise, where $\eta_{(r)} = (\eta_{1(r)}, ..., \eta_{G(r)})$.

If  the  second-phase  sample  is  stratified,  and 
more  than  one  of  these  strata  are  contained 
within  a  calibration  group,  then  the  jackknife 
can  have  an  upward  bias  (see  Appendix  B). 

When  the  second-phase  sample  is 
unstratified  or  the  second-phase  strata  and 
ratio-adjustment  groups  coincide,  the  delete- 
a-group  jackknife  is  nearly  unbiased.  In  the 
1996 ARMS and 1996 VCUS, second- (and
later-) phase sampling was unstratified.

Extensions of these results to estimation
strategies with t > 2 phases are straightforward;
the $f_j$ in equation (4) and $f_{j(r)}$ in
equation (5) become the weight and replicate
weight at the (t-1)’th phase. For a single-phase
sample, we can set all the $p_j$ equal to
1 in both equations (4) and (5).

The  establishment  of  the  appropriateness  of 
the  delete-a-group  jackknife  for  ratio- 
adjusted  estimators  parallels  that  of 
restricted-regression  estimators,  which  is 
outlined  in  Appendix  B. 
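
Equations (4) and (5) can be sketched with plain Python. The function name, group labels, and numbers below are hypothetical, and the control totals are assumed to come from the frame so that the same targets serve every replicate. Equation (5) is obtained by calling the same function with replicate starting weights $f_{j(r)}/p_j$, set to 0 for elements in the deleted group.

```python
def ratio_adjusted_weights(base_weights, x, group, targets):
    """Equation (4): w_j = eta_g * (f_j/p_j) / sum_i (f_i/p_i) * x_ig.

    base_weights[j] is f_j / p_j, x[j] the non-zero covariate value of
    element j, group[j] its calibration group g, targets[g] the total eta_g.
    A group whose starting weights are all 0 would divide by zero.
    """
    denom = {}
    for a, xj, g in zip(base_weights, x, group):
        denom[g] = denom.get(g, 0.0) + a * xj
    return [targets[g] * a / denom[g] for a, g in zip(base_weights, group)]

# Hypothetical four-element sample in two calibration groups.
base = [10.0, 10.0, 20.0, 20.0]
x = [2.0, 1.0, 1.0, 3.0]
group = ["g1", "g1", "g2", "g2"]
targets = {"g1": 30.0, "g2": 50.0}
w = ratio_adjusted_weights(base, x, group, targets)

# Replicate with element 0 deleted; 15/14 stands in for f_j(r) / f_j.
replicate_base = [0.0] + [a * 15.0 / 14.0 for a in base[1:]]
w_r = ratio_adjusted_weights(replicate_base, x, group, targets)
```

In both the full sample and the replicate, the weighted covariate total in each group reproduces its target.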

NASS  Applications  of  Ratio-Adjusted 
Weighting 

One  way  to  handle  nonresponse  is  to  treat 
the  set  of  responding  elements  (at  any  phase 
of  the  design)  as  a  stratified  simple  random 
subsample  of  the  selected  sample.  This  was 
essentially what was done in the first phase
of the 1996 VCUS. All the original sample
elements (respondents and nonrespondents)
were assigned to jackknife replicates, and
nonresponse was treated as a second phase
of sampling. The “second-phase” strata
and calibration groups coincided with the
original stratum definitions, and $x_{ig}$ was set
equal to 1 when i was in group g (0,
otherwise). Since $f_i$ was equal for all i in the
same stratum, and $p_i$ was likewise identical
for each respondent i in the stratum, $w_i$ in
equation (4) collapsed to the population size
in  the  stratum  containing  i  divided  by  the 
respondent  sample  size  in  that  stratum. 
Equation  (5)  collapsed  similarly. 
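
The collapse can be written out explicitly. In the sketch below, $f_h$ and $p_h$ denote the common first-phase weight and response "probability" in stratum h, and $n_h^{R}$ the number of respondents there; these three symbols are introduced here for illustration. The control total is taken to be $\eta_g = N_h$, which is what the stated result implies.

```latex
w_j = N_h \Big( \sum_{i \in S \cap h} \frac{f_h}{p_h} \cdot 1 \Big)^{-1} \frac{f_h}{p_h}
    = N_h \Big( n_h^{R}\, \frac{f_h}{p_h} \Big)^{-1} \frac{f_h}{p_h}
    = \frac{N_h}{n_h^{R}} .
```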

In  the  1996  ARMS,  a  stratified  simple 
random  screening  sample  of  farms  was 
subsampled  sequentially  for  several  mutually 
exclusive  survey  modules  (see  Kott  and 
Fetter  1997).  Farms  were  selected  for  the 
Phase  II  Soybean  PPR  in  Nebraska,  for 
example,  using  an  additional  five  phases  of 
sampling  (to  be  selected  for  this  module,  a 
respondent farm from the screening sample
had to avoid being subsampled for one of the
four modules preceding it). Each of these
phases employed unstratified systematic
probability sampling from a purposefully-ordered
list (the theory in Appendix B
assumes a randomly-ordered list; if
anything,  purposeful  ordering  should  reduce 
mean  squared  errors  and  contribute  an 
upward  bias  to  the  delete-a-group  jackknife). 
Finally,  a  field  was  randomly  selected  from 
each  sampled  soybean  farm. 

The  separate-ratio  estimator  in  equation  (4) 
was  used  twice  in  Phase  II  Soybean  PPR 
estimates.  It  was  used  to  ratio  adjust  the 
weights  for  the  screening-sample 
respondents  to  the  frame  total-value-of-sales 
within  every  screening  stratum  (notice  that 
response/nonresponse  on  the  screening 
survey  is  treated  here  implicitly  as  another 
phase  of  sampling).  In  addition,  the  soybean 
field  sample  was  divided  into  three  size 
groups. Here, $\eta_g$ was the total soybean
acres in calibration (size) group g as
estimated from the screening sample with the
weights described above, and $p_i$ was the
product of the six (conditional) probabilities:
the  probabilities  that  the  farm  containing 
field  i  was  not  selected  for  the  four  modules 
preceding  soybeans,  the  probability  that  this 
farm  was  selected  for  the  production 
practices  module,  and  the  probability  that 
field  i  was  subsampled  from  the  farm. 

We  treated  the  fields  from  which  we 
collected  Phase  II  PPR  information  as  if 
they  were  a  stratified  simple  random 
subsample  of  the  selected  fields,  where  the 
three  calibration  groups  served  as  strata. 
This  had  no  practical  effect  on  the 
calculation of the $p_i$ (observe that if all the $p_i$
in  a  group  are  multiplied  by  the  same  factor, 
the  computed  weights  in  equations  (4)  and 
(5)  are  unchanged). 

Composite  Estimators 
Consider a set of C distinct samples, each of
which can be used to estimate a common
target value. Let S denote the combined
sample, and $w_i^{(c)}$ denote the weight for
element i in original sample c. If i is not in
sample c, set $w_i^{(c)} = 0$. A composite
estimator t uses the set of weights $\{w_i\}$,
where each $w_i = \sum_c \lambda_c w_i^{(c)}$ and $\sum_c \lambda_c = 1$.

To estimate the variance of t, we can create
15 sets of replicate weights for every $w_i^{(c)}$
and denote each by $\{w_{i(r)}^{(c)}\}$. We then
estimate the $t_{(r)}$ using $w_{i(r)} = \sum_c \lambda_c w_{i(r)}^{(c)}$ and
compute $v_J$ using equation (1).
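
The compositing step is a weighted average of the component weights, element by element. A minimal sketch follows, with a hypothetical function name and made-up weights for two overlapping samples:

```python
def composite_weights(component_weights, lambdas):
    """w_i = sum over c of lambda_c * w_i^(c), with the lambdas summing to 1.

    component_weights[c][i] is the weight of element i in sample c,
    set to 0 when element i is not in that sample.
    """
    if abs(sum(lambdas) - 1.0) > 1e-12:
        raise ValueError("compositing factors must sum to 1")
    return [
        sum(lam * w for lam, w in zip(lambdas, per_element))
        for per_element in zip(*component_weights)
    ]

# Hypothetical combined sample of three elements drawn from two samples.
sample_1 = [10.0, 10.0, 0.0]   # element 2 is not in sample 1
sample_2 = [0.0, 20.0, 20.0]   # element 0 is not in sample 2
w = composite_weights([sample_1, sample_2], [0.4, 0.6])
print(w)
```

The same call, applied to the r'th replicate weights of each component sample, would give the composite replicate weights needed for equation (1).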

Composite  estimation  was  used,  for 
example,  to  combine  the  Phase  III  Beef  and 
Corn-for-Grain  CRR  follow-on  samples  in 
the  1996  ARMS  with  the  Phase  III  CRR 
stand-alone  sample.  First  the  two  enterprise 
CRR samples were composited and then this
combined sample was composited with the
other  CRR  sample  (see  Kott  and  Fetter 
1997). 

Samples  being  combined  need  not 
correspond  to  identical  target  populations. 
For  example,  the  population  of  list  farms 
with corn for grain in 1996 is not the same
as  the  population  of  list  farms  with  ten 
weaned  calves  in  1996  (the  Beef  CRR 
population).  When  combining  CRR 
samples,  we  also  combine  target 
populations;  in  this  case,  to  the  set  of  all  list 
farms  with  either  grain  corn  or  ten  weaned 
calves  in  1996.  Only  those  sample  farms 
having both corn for grain and at least 10
weaned calves are assigned composite
weights as described above. Other farms in
the combined sample retain their pre-composite
weights.

Appendix  C  shows  why  the  delete-a-group 
jackknife  works  for  the  composite  estimators 
used  in  the  ARMS  in  which  the  components 
were  separate  modules  based  on  the  same 
screening sample. Composite estimation
was  also  used  in  the  ARMS  to  combine 
independently  drawn  samples  like  the  Phase 
II  Soybean  PPR  sample  and  the  National 
Resource  Inventory  sample.  Here,  like  a 
conventional  jackknife,  when  the  delete-a- 
group  jackknife  is  appropriate  for  each 
independent  component,  it  is  also 
appropriate  for  any  linear  combination  of  the 
components. 

SINGLE-PHASE  POISSON 
SAMPLING AND
FINITE POPULATION CORRECTION

In this section, we restrict our attention - at
first - to a single-phase Poisson sample of
elements. Let $\pi_j$ be the selection probability
of element j. We assume there is no
nonresponse.

The versions of the delete-a-group jackknife
developed in this section will contain finite
population corrections. The versions are
different for an estimator of a total and the
estimator of a ratio. This is a reflection of
the fact that a simple formula like equation
(1) does NOT work for all smooth
transformations of calibrated expansion
estimators when finite population correction
is an issue (note: a “smooth”
transformation has first, second, and third
derivatives; most statistics of interest are
smooth transformations of expansion
estimators, the major exception being
percentiles).

A  Calibrated  Estimator  for  a  Total 
Suppose we have a calibrated estimator for
a total, $t = \sum_S w_j y_j$, where

$$w_j = 1/\pi_j + \Big(\eta^* - \sum_{i \in S^*} [1/\pi_i]\, x_i\Big) \Big(\sum_{i \in S^*} [1/\pi_i]\, x_i' x_i\Big)^{-1} [1/\pi_j]\, x_j' \tag{6}$$

for $j \in S^*$, and a predetermined value
otherwise (chosen so that $w_j \ge 1$ and,
perhaps, not too far from $1/\pi_j$), S is the
sample, $S^*$ a subset containing almost all the
elements of S, $x_i$ is a row vector of
covariates whose sum across all elements in
the population is $\eta$, and
$\eta^* = \eta - \sum_{i \in S - S^*} w_i x_i$. There must also be a vector $\lambda$
such that $x_j \lambda = \sqrt{1 - \pi_j}$ for all j (that is to
say, either a component of $x_j$ or a linear
combination of components must equal
$\sqrt{1 - \pi_j}$). Since we are dealing with a
single-phase  sample,  (6)  is  simply  equation 

(2)  with  l/7tk  replacing  fk  /pk  (i.e.,  fk  in 
equation  (2)  is  1,  while  pk  is  7tk). 
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To  estimate  the  variance  of  t,  we  use 
equation  (1)  but  replace  t  with  t(v)  = 

Is  w/v)yj,  and  t(r)  with  t(r)(v)  =£s  wJ(r)(v)yj5 
where 

w/v)  =  w/(l  -  1/Wj),  (7) 

and 

Wj(r)(V>  =  WJ(V){1  +  (Is  Wi(V)Xi  -  Is(r)  Wi(v)Xi) 

(IscDWi^Xi’Xi)-1^’}  (8) 

when  j  e  S(r)  and  0  otherwise.  Appendix  D 
outlines  why  this  works. 
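The computation described above can be sketched in a few lines. The following is a minimal illustration, not the production NASS code: it assumes the calibrated weights are already in hand, that every weight exceeds 1, that jackknife groups are labelled 0 through R-1, and that equation (1) has the form $v_J = ([R-1]/R)\sum_r (t_{(r)} - t)^2$ with R = 15 (the form given in Appendix A). The function and argument names are hypothetical.

```python
import numpy as np


def fpc_jackknife_total(w, x, y, groups, R=15):
    """Delete-a-group jackknife with fpc for a calibrated total (sketch).

    w: calibrated weights (all > 1); x: (n, p) array of covariate rows;
    y: item values; groups: jackknife group label (0..R-1) for each unit.
    """
    w = np.asarray(w, dtype=float)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)

    wv = w * np.sqrt(1.0 - 1.0 / w)       # equation (7)
    t_v = wv @ y                          # t^(v)
    full_x = wv @ x                       # sum over S of w_i^(v) x_i

    sq_dev = 0.0
    for r in range(R):
        keep = groups != r                # S_(r): the sample without group r
        xr, wr = x[keep], wv[keep]
        gap = full_x - wr @ xr            # first bracket in equation (8)
        m = xr.T @ (wr[:, None] * xr)     # sum over S_(r) of w_i^(v) x_i' x_i
        adj = 1.0 + xr @ np.linalg.solve(m, gap)
        t_r = (wr * adj) @ y[keep]        # t_(r)^(v)
        sq_dev += (t_r - t_v) ** 2
    return (R - 1) / R * sq_dev
```

One useful sanity check on any implementation: when y is an exact linear combination of the covariates, every replicate total equals the full-sample total, so the estimated variance should be zero up to rounding.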

Observe that $w_j^{(v)} \approx w_j\sqrt{1 - \pi_j}$, so that
$w_j^{(v)} \approx w_j$ when the selection probability for
element j is ignorably small. When all
element selection probabilities are very
small, there is little difference between this
delete-a-group jackknife for a total estimator
with finite population correction, $v_{J(fpcT)}$, and
the standard delete-a-group jackknife, $v_J$.
Moreover, the rather odd assumption that
there exists a $\lambda$ such that $x_j\lambda = \sqrt{1 - \pi_j}$
becomes close to the more standard
assumption that either a component of $x_j$ or
a linear combination of components is a
constant (i.e., $x_j\lambda = 1$ for some $\lambda$).

In fact, if we were to ignore finite
population correction (which we can do for
most surveys, but not the VCUS), we could
estimate the variance of t with equation (1),
replacing equation (8) with

$$w_{j(r)} = w_j \Big\{ 1 + \Big(\sum_{S} w_i x_i - \sum_{S_{(r)}} w_i x_i\Big)\Big(\sum_{S_{(r)}} w_i x_i' x_i\Big)^{-1} x_j' \Big\} \qquad (8')$$

when $j \in S_{(r)}$ and 0 otherwise as long as
$x_j\lambda = 1$ for some $\lambda$. This is what we did for
the 1997 Minnesota pilot QAS (see Bailey
and Kott 1997).


An  Estimator  for  a  Ratio 
Suppose $t_R$ is an estimator for a ratio of the
form $t_R = \sum_{S} w_j y_j / \sum_{S} w_j z_j$, where $w_j$ is
calibrated as above. One can estimate the
variance of $t_R$ with

$$v_{J(fpcR)} = \Big(\sum_{S} w_j^{(v)} z_j \Big/ \sum_{S} w_j z_j\Big)^2 \, (14/15) \sum_{r=1}^{15} \big(t_{R(r)}^{(v)} - t_R^{(v)}\big)^2, \qquad (9)$$

where $t_R^{(v)} = \sum_{S} w_j^{(v)} y_j / \sum_{S} w_j^{(v)} z_j$ and
$t_{R(r)}^{(v)} = \sum_{S} w_{j(r)}^{(v)} y_j / \sum_{S} w_{j(r)}^{(v)} z_j$. This assumes
$x_j\lambda = \sqrt{1 - \pi_j}$ for some $\lambda$. Even without
this assumption holding, and in fact even
without calibration, $v_{J(fpcR)}$ will likely be a
reasonable variance estimator, as we shall
see.
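As an illustration of equation (9), the sketch below handles the simplest case mentioned above, in which the weights are not calibrated. Under that assumption (ours, for illustration) the replicate weights simply drop the deleted group, since any common rescaling of the remaining weights cancels in a ratio. The function name and the 0 through R-1 group labels are hypothetical.

```python
import math


def fpc_jackknife_ratio(w, y, z, groups, R=15):
    """v_J(fpcR) for t_R = sum(w*y) / sum(w*z), uncalibrated weights (sketch).

    w: weights (all > 1); y, z: item values; groups: labels 0..R-1.
    """
    wv = [wj * math.sqrt(1.0 - 1.0 / wj) for wj in w]   # equation (7)

    def ratio(skip=None):
        # Ratio computed from the adjusted weights, leaving out group `skip`.
        num = sum(a * b for a, b, g in zip(wv, y, groups) if g != skip)
        den = sum(a * c for a, c, g in zip(wv, z, groups) if g != skip)
        return num / den

    t_v = ratio()                                        # t_R^(v)
    ss = sum((ratio(skip=r) - t_v) ** 2 for r in range(R))
    # Leading factor in equation (9).
    factor = (sum(a * c for a, c in zip(wv, z))
              / sum(a * c for a, c in zip(w, z))) ** 2
    return factor * (R - 1) / R * ss
```

With calibrated weights, the replicate weights of equation (8) would replace the simple drop-a-group weights used here.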

Alternatively, we could estimate the variance
of $t_R$ ignoring finite population correction
with equation (1). We need not assume that
$x_j\lambda = 1$ for some $\lambda$. In fact, the $w_j$ need not
even be calibrated in this case (to see why,
observe that multiplying all the weights in $t_R$
by a fixed constant so that $\sum_{S} w_j$ equals the
population size has no effect on the
estimator; consequently, all ratio estimators
are effectively calibrated on $x_j = 1$).

Some  Explanations  and  Extensions 
Consider a single-phase element sample that
is not necessarily Poisson. Suppose we wish
to estimate the variance of $t = \sum_{S} w_j y_j$,
where the $w_j$ satisfy equation (6). Let
$u_k = y_k - x_k\big(\sum_{U} x_i' x_i\big)^{-1}\sum_{U} x_i' y_i$, where U
denotes the population. The variance of t is
approximately

$$V = \sum_{U} u_k^2 (1 - \pi_k)/\pi_k + \sum_{U(k \ne i)} u_k u_i (\pi_{ki} - \pi_k\pi_i)/(\pi_k\pi_i). \qquad (10)$$
Under Poisson sampling the joint selection
probability of k and i, $\pi_{ki}$, is equal to the
product $\pi_k\pi_i$, and so V collapses to
$\sum_{U} u_k^2 (1 - \pi_k)/\pi_k$. This can, in principle, be
estimated by $\sum_{S} u_k^2 (1 - \pi_k)/\pi_k^2$, which is
approximately equal to
$\sum_{S} [w_k u_k \sqrt{1 - 1/w_k}]^2$, which is what $v_{J(fpcT)}$
estimates (see Appendix D).
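The approximate equality in the preceding paragraph can be written out in one line. It relies only on the calibrated weight being close to the inverse selection probability, $w_k \approx 1/\pi_k$ (an approximation, not an identity, once calibration has moved the weights):

```latex
\sum_{S} u_k^2 \,\frac{1 - \pi_k}{\pi_k^2}
  \;=\; \sum_{S} \Big(\frac{1}{\pi_k}\Big)^{2} u_k^2 \,(1 - \pi_k)
  \;\approx\; \sum_{S} w_k^2\, u_k^2 \,\Big(1 - \frac{1}{w_k}\Big)
  \;=\; \sum_{S} \Big[ w_k u_k \sqrt{1 - 1/w_k} \Big]^2
  \;=\; \sum_{S} \big( w_k^{(v)} u_k \big)^2 .
```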

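A similar argument can be made for the ratio
estimator, $t_R = \sum_{S} w_j y_j / \sum_{S} w_j z_j$, except that
now V becomes approximately

$$V^* = \Big(\sum_{S} w_j z_j\Big)^{-2} \Big[ \sum_{U} (u_k^*)^2 (1 - \pi_k)/\pi_k + \sum_{U(k \ne i)} u_k^* u_i^* (\pi_{ki} - \pi_k\pi_i)/(\pi_k\pi_i) \Big],$$

where

$u_k^* = u_k^+ - x_k\big(\sum_{U} x_i' x_i\big)^{-1}\sum_{U} x_i' u_i^+$, and

$u_k^+ = y_k - \big(\sum_{U} y_i / \sum_{U} z_i\big) z_k$.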

Under Poisson sampling, $V^*$ collapses to
$\big(\sum_{S} w_j z_j\big)^{-2}\big[\sum_{U} (u_k^*)^2 (1 - \pi_k)/\pi_k\big]$. If we tried
to compute a delete-a-group jackknife with
equation (1) replacing $t$ by $t_R^{(v)}$ and $t_{(r)}$ by
$t_{R(r)}^{(v)}$, we would get a reasonable estimator
for $\big(\sum_{S} w_j^{(v)} z_j\big)^{-2}\big[\sum_{U} (u_k^*)^2 (1 - \pi_k)/\pi_k\big]$ rather
than $V^*$, hence the factor $\big(\sum_{S} w_j^{(v)} z_j / \sum_{S} w_j z_j\big)^2$
on the right hand side of equation (9).

This factor is unnecessary if finite
population correction is ignored. In fact,
since $\sum_{U} u_i^+ = 0$ (simplifying the proof in
Appendix D), the weights need not be
calibrated for the delete-a-group jackknife
variance estimator for $t_R$ to be nearly
unbiased.

Calibrated estimators of totals were
computed in the 1997 Minnesota pilot QAS.
Sampling was not exactly Poisson due to the
need to combine some samples and
subsample others (see Bailey and Kott
1997). Nevertheless, it is not unreasonable
to assume that $\sum_{U(k \ne i)} u_k u_i (\pi_{ki} - \pi_k\pi_i)/(\pi_k\pi_i)$ in
the right hand side of equation (10) is
roughly zero and then, ignoring finite
population correction, use $v_J$ to estimate
variances.

It is of interest to note that for systematic
probability sampling from a purposefully-ordered
list, $\pi_{ki}$ will often be zero when i
and k are listed sequentially in the ordering.
If $u_k$ and $u_i$ in equation (10) tend to have the
same sign when i and k are listed together,
then it is likely that
$\sum_{U(k \ne i)} u_k u_i (\pi_{ki} - \pi_k\pi_i)/(\pi_k\pi_i)$ will be negative,
reducing the variance of t. The
delete-a-group jackknife does not capture this
variance-reducing phenomenon, however.
That  is  why  it  was  claimed  earlier  that  the 
use  of  systematic  unequal  probability 
sampling  from  an  ordered  list  will,  if 
anything,  bias  the  delete-a-group  jackknife 
upward.  This  presupposes  that  elements  (or 
units)  listed  together  in  the  ordering  are  in 
some  sense  similar. 

Remember the delete-a-group jackknife for
an estimated total with finite population
correction, $v_{J(fpcT)}$, is only appropriate when
there is no nonresponse. Still, computing
$v_{J(fpcT)}$ and $v_J$ using imputed values in place
of real ones can provide a means of
evaluating the impact of high selection
probabilities on variance. There is one
additional caveat. When one does not
require that there exist a $\lambda$ such that
$x_j\lambda = \sqrt{1 - \pi_j}$ for all j, then $v_{J(fpcT)}$ may be biased
downward. This possibility is likely to be
remote in practice (see Appendix D).

We could have used $v_{J(fpcR)}$ to estimate
variances from the 1996 VCUS. In theory,
this might not be appropriate since the
calibration in that survey was to first-phase
totals rather than to control totals (see Hicks
1998). Moreover, we did not require that
there be a vector $\lambda$ such that $x_j\lambda = \sqrt{1 - \pi_j}$
for all j. It is unlikely that either failing
would cause much bias in $v_{J(fpcR)}$. This is
because calibration does little to reduce the
variance of t in the VCUS (see Hicks 1998).
Moreover, it is likely that $\sum_{S} w_i^{(v)} u_i^*$ (or
$\sum_{S} w_i^{(v)} u_i^+$ when $v_{J(fpcR)}$ is used with an
uncalibrated VCUS estimator) is close to zero
even when there is no $\lambda$ such that
$x_j\lambda = \sqrt{1 - \pi_j}$. Appendix D explains why
$\sum_{S} w_i^{(v)} u_i^*$ must be near zero.

SUMMARY  OF  NASS  USES  (SO  FAR) 

The  delete-a-group  jackknife  was  used  to 
estimate  variances  for  the  1996  ARMS,  1996 
VCUS,  and  1997  Minnesota  pilot  QAS.  It 
has  also  been  used  in  some  of  NASS’s 
foreign  consulting  work,  but  that  is  beyond 
the  scope  of  this  report. 

Bailey  and  Kott  (1997)  describes  the  sample 
design  used  in  the  Minnesota  pilot  QAS. 
Since  NASS  imputes  for  nonresponse  on  the 
QAS,  there  was  essentially  a  single-phase 
sample in Minnesota. Equation (2), with all
$p_j$ set equal to 1 and $f_j$ computed as
described in Bailey and Kott, was used to
generate most of the calibration weights.
The vector $x_j$ had 20 components including
a constant term.

When a $w_j$ calculated with equation (2)
would have been less than 1, farm j was
removed from $S^*$, and $w_j$ was set equal to 1.
Sampled farms were randomly assigned to
jackknife groups, and replicate weights were
calculated using the more conventional

$$w_{j(r)} = [15/14] f_j + \Big(\eta_{(r)} - \sum_{i \in S_{(r)}} [15/14] f_i x_i\Big)\Big(\sum_{i \in S_{(r)}} [15/14] f_i x_i' x_i\Big)^{-1} [15/14] f_j x_j'$$

$$\phantom{w_{j(r)}} = f_j + \Big(\eta_{(r)} - \sum_{i \in S_{(r)}} f_i x_i\Big)\Big(\sum_{i \in S_{(r)}} f_i x_i' x_i\Big)^{-1} f_j x_j'$$

instead of equation (3). (The second equality
holds because $x_j$ includes a constant term.)
This was because
the advantages of using the latter equation
were not clear at the time.
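The equality of the two forms of the replicate weight above is easy to confirm numerically. The sketch below is illustrative only: the function name is hypothetical, `eta_r` stands in for the replicate target vector $\eta_{(r)}$ (whose definition belongs to equation (3) and is not reproduced here), and the inputs are understood to cover only the units in $S_{(r)}$.

```python
import numpy as np


def replicate_weights(f, x, eta_r, scale):
    """Regression-type replicate weights for the units in S_(r) (sketch).

    Returns scale*f_j + (eta_r - sum_i scale*f_i x_i)
            (sum_i scale*f_i x_i' x_i)^{-1} scale*f_j x_j'.
    """
    g = scale * np.asarray(f, dtype=float)   # starting weights, e.g. [15/14] f_j
    x = np.asarray(x, dtype=float)
    gap = eta_r - g @ x                      # shortfall from the target totals
    m = x.T @ (g[:, None] * x)               # sum of g_i x_i' x_i
    return g * (1.0 + x @ np.linalg.solve(m, gap))
```

When one column of `x` is constant, calling this with `scale=15/14` and with `scale=1.0` should give the same weights, and in both cases the weighted covariate totals reproduce `eta_r`.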

The  1996  VCUS  had  a  two-phase  design. 
The  first  phase  was  stratified  simple  random 
sampling.  Sampled  units  were  randomly 
assigned  to  jackknife  groups  within  first- 
phase  strata.  Nonresponse  to  the  first  design 
phase  of  the  VCUS  was  treated  as  an 
additional  phase  of  stratified  simple  random 
sampling in which the strata were the same as
in the first phase.

The second design phase of the VCUS used
systematic unequal probability sampling.
Nonresponse to this phase was treated as an
additional phase of simple random sampling.
Equation (2) was used to compute calibrated
weights. The "first-phase" weight, $f_j$, was
actually the population size of the first-phase
stratum containing j divided by the number
of first-phase usables in the stratum; $\eta$ was
a vector of estimated planted-acre totals for
in-scope vegetables based on the first-phase
sample adjusted for nonresponse; $p_j$ was the
second-phase probability of selection
multiplied by the number of second-phase
usables and divided by the number of
second-phase sample farms. Replicate
weights were computed using

$$w_{j(r)} = w_j + \Big(\eta_{(r)} - \sum_{i \in S_{(r)}} w_i x_i\Big)\Big(\sum_{i \in S_{(r)}} w_i x_i' x_i\Big)^{-1} w_j x_j',$$

which turned out to have slightly better
empirical properties (fewer negative values;
values closer to $w_j$) in this context than those
produced by equation (3) for some reason
(recall that Hicks [1998] shows that
calibration had little effect here).

The  many  uses  of  ratio-adjustment  and 
composite  estimation  in  list-based  estimates 
from  the  1996  ARMS  are  discussed 
thoroughly  in  Kott  and  Fetter  (1997).  The 
original  screening  sample  was  randomly 
allocated  into  jackknife  groups  on  a  stratum- 
by-stratum  basis.  The  text  provides  some 
details  for  a  couple  of  examples.  See  Kott 
and  Fetter  for  more. 

The delete-a-group jackknife was also used
for the non-overlap (area) portion of the
1996 ARMS for economic statistics. The
area design had effectively three phases: 1)
a stratified simple random sample of area
segments; 2) a restratified simple random
subsample of farms; and 3) a stratified
(using the first-phase strata) simple random
subsample of respondents. Using the
delete-a-group jackknife in this context treats the
three-phase sample as if it were a three-stage
sample. As a result, the variance estimator
can be biased upward (see Kott 1990). The
problem here is that the second-phase sample
is not calibrated in any way.

There  is  an  additional  source  of  upward  bias 
in  the  delete-a-group  jackknife  applied  to  the 
1996  ARMS  non-overlap  sample.  Some 
area substrata have very small sample sizes
(fewer than five area segments). Collapsing
substrata  into  land-use  strata  helped  some, 
but  on  occasion  even  land-use  strata  had 
small  sample  sizes.  Appendix  A  shows  why 
this  can  cause  a  bias  in  the  delete-a-group 
jackknife. 


A  DIGRESSION  ON  MODEL-BASED 
INFERENCE 

The  delete-a-group  jackknife  can  be  applied 
to  estimate  variances  in  a  reasonable  fashion 
under  a  variety  of  complex  estimation 
strategies.  Both  the  text  so  far  and  the 
appendices  rely  exclusively  on  the  principles 
of  randomization-based  inference.  As  a 
result, we were forced to make two
questionable or erroneous
assumptions:

1)  systematic  probability  sampling  is 
conducted  by  NASS  from  randomly-ordered 
lists,  and 

2) farms in the same ratio-adjustment
(calibration)  group  are  equally  likely  to 
respond  to  a  survey. 

These assumptions would not be necessary if
we replaced them by the model assumptions
behind calibration, namely,

$$y_i = x_i\beta + \epsilon_i,$$

where the $\epsilon_i$ have zero mean, bounded
variances, and are uncorrelated, at least
across first-phase sampling units.

For example, consider the difference
between the calibration estimator,
$t = \sum_{S} w_i y_i$, and its target, $T = \sum_{U} y_i$:

$$t - T = \sum_{S} w_i y_i - \sum_{U} y_i = \sum_{S} w_i \epsilon_i - \sum_{U} \epsilon_i = \sum_{U} (w_i I_i - 1)\epsilon_i,$$

where $I_i = 1$ when $i \in S$, and $I_i = 0$
otherwise. Now

$$(t - T)^2 = \sum_{U} (w_i I_i - 1)^2 \epsilon_i^2 + \sum_{U(i \ne k)} (w_i I_i - 1)(w_k I_k - 1)\epsilon_i \epsilon_k.$$

If $\epsilon_i$ and $\epsilon_k$ are uncorrelated, the model
variance of t as an estimator for T is

$$E_\epsilon[(t - T)^2] = \sum_{U} (w_i^2 I_i - 2 w_i I_i + 1) E(\epsilon_i^2) = \sum_{S} (w_i^2 - w_i) E(\epsilon_i^2) - \Big[\sum_{S} w_i E(\epsilon_i^2) - \sum_{U} E(\epsilon_i^2)\Big]$$

no matter what the sampling design.
Moreover, if $E(\epsilon_i^2) = x_i\gamma$ for some $\gamma$, then
the model variance of t as an estimator for T
collapses further to $\sum_{S} (w_i^2 - w_i) E(\epsilon_i^2)$, which
is what the delete-a-group jackknife for a
total with finite population correction
estimates. (Note: even if $E(\epsilon_i^2)$ does not
equal $x_i\gamma$ for some $\gamma$, we know that
$\sum_{S} w_i E(\epsilon_i^2) \approx \sum_{S} (1/\pi_i) E(\epsilon_i^2) \approx \sum_{U} E(\epsilon_i^2)$ for
randomization-based reasons).
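The rearrangement of the model variance above is pure algebra (it uses only $I_i^2 = I_i$), and can be checked on any small example. The sketch below is illustrative; the function name and inputs are hypothetical, with `sigma2` playing the role of $E(\epsilon_i^2)$ for every population element.

```python
def model_variance_two_ways(w, in_sample, sigma2):
    """Return both forms of E[(t - T)^2] under uncorrelated errors (sketch).

    w: weight for each population element (ignored when not sampled);
    in_sample: 1 if the element is in S, else 0; sigma2: E(eps_i^2).
    """
    # Sum over U of (w_i^2 I_i - 2 w_i I_i + 1) E(eps_i^2).
    direct = sum((wi * wi * ii - 2.0 * wi * ii + 1.0) * s
                 for wi, ii, s in zip(w, in_sample, sigma2))
    # Sum over S of (w_i^2 - w_i) E(eps_i^2).
    main = sum((wi * wi - wi) * s
               for wi, ii, s in zip(w, in_sample, sigma2) if ii)
    # Bracketed term: sum over S of w_i E(eps_i^2) minus sum over U of E(eps_i^2).
    bracket = (sum(wi * s for wi, ii, s in zip(w, in_sample, sigma2) if ii)
               - sum(sigma2))
    return direct, main - bracket
```

The two returned values agree for any inputs; the bracketed term vanishes only when the weights are calibrated on the variance function, as described above.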

Similar  arguments  can  be  made  for 
calibration  in  the  second  (or  later)  phase  of 
sampling.  Kott  (1997)  contains  a  treatment 
of  this  topic  for  the  conventional  delete-one- 
primary-sampling-unit  jackknife.  The 
interested  reader  may  also  want  to  look  at 
the expression for $Var_{2d}$ in Appendix B and
replace each $u_k$ with $\epsilon_k$. Similar
substitutions can profitably be made in
Appendices C and D as well.

In  the  real  world,  models  fail,  which  is  the 
reason  NASS  insists  on  using  randomization- 
consistent  estimators  where  possible.  The 
impact  of  model  failure  is  typically  greater 
on  bias  than  on  variance.  This  is  because 
model  failure  is  usually  small  and  subtle  but 
can nonetheless lead to a bias in a
non-randomization-consistent estimator that is
not  asymptotically  ignorable.  Once  the 
potential  for  asymptotic  bias  is  removed  by 
using  a  randomization-consistent  estimator, 
a  model  can  often  be  safely  invoked  to 
estimate  variance. 


The  situation  can  be  reversed  when  ratio- 
adjustment  is  used  (in  part)  to  handle 
nonresponse  as  in  the  1996  ARMS  and 
VCUS.  The  model  assumption  that  the 
expected  value  for  an  unknown  y-value  is  a 
fixed  multiple  of  a  known  x-value  within  a 
ratio-adjustment  group  is  usually  more 
reasonable  than  the  quasi-randomization 
assumption  that  all  farms  in  the  group  are 
equally  likely  to  be  survey  respondents.  In 
this  situation,  the  assumption  of  the  linear 
model  offers  some  protection  against  a 
systematic  bias  in  an  estimated  value  due  to 
the  failure  of  the  quasi-randomization 
assumption. 

CONCLUDING  REMARKS 

The  delete-a-group  jackknife  is  remarkably 
simple  to  compute  once  appropriate  replicate 
weights  are  determined.  We  have  seen  how 
this  variance  estimation  method  can  be 
meaningfully  applied  to  a  number  of 
complex  estimation  strategies.  These 
include  the  1996  ARMS  (with  multiple 
phases  and  ratio  adjustments),  the  1997 
Minnesota  pilot  QAS  (restricted  regression 
and  Poisson  sampling),  and  the  1996  VCUS 
(two  phases,  calibration  of  the  second  phase 
to  the  first,  and  finite  population  correction 
problems). 

Like  any  variance  estimator,  the  delete-a- 
group  jackknife  is  not  necessarily  nearly 
unbiased  when  any  phase  of  the  sample  is 
drawn systematically from a purposefully-ordered
list (as is the case in later phases of
the  ARMS).  If  anything,  however,  the 
delete-a-group  jackknife  will  usually  be 
conservative  (biased  upward)  in  this 
circumstance. 

In addition, the delete-a-group jackknife
requires the following to be nearly unbiased:

1)  results  from  each  phase  of  a  survey  - 
including  the  first  phase  -  be  calibrated  for 
some  key  items  of  interest  on  results  from 
either an earlier phase or the frame
(for  example,  the  estimated  number  of  farm 
names  on  the  list  frame  is  often  forced  to 
equal  the  actual  number  of  farm  names  on 
the  list  frame);  and 

2) the sample size at every follow-on
phase  of  a  non-nested  (not  multi-stage) 
multi-phase  design  be  large  (contain  at  least 
five  sample  units  per  stratum  at  that  phase). 

These  are  not  difficult  requirements,  and 
NASS needs to keep them in mind when
developing  estimators  in  the  future. 

A disadvantage of the delete-a-group
jackknife relative to potential competitors is that it
requires the first-phase stratum sample sizes
to be large (at least five sample units per
stratum). Otherwise, the delete-a-group
jackknife can be overly conservative. As a
result, when this jackknife is applied to
estimators from the NASS area frame, as it
was with the non-overlap component of the
1996 Phase III CRR, it has an upward bias.
NASS needs to assess how big a problem
this constitutes in practice.

REFERENCES 

Bailey, J.T. and Kott, P.S. (1997). An
Application of Multiple List Frame Sampling
for Multi-Purpose Surveys. ASA
Proceedings of the Section on Survey
Research Methods, forthcoming.

Hartley, H.O. and Rao, J.N.K. (1962).
Sampling with Unequal Probabilities and
Without Replacement. The Annals of
Mathematical Statistics, 350-74.

Hicks, S.D. (1998). An Evaluation of the
Sample Design and Estimation Strategy Used
for the 1996 Vegetable Chemical Use
Survey.

Kott, P.S. (1990). Variance Estimation
when a First Phase Area Sample is
Restratified. Survey Methodology, 99-104.

Kott, P.S. (1997). A (Partially) Model-based
Look at Jackknife Variance Estimation
with Two-Phase Samples. Statistical Society
of Canada Proceedings of the Survey Methods
Section, forthcoming.

Kott, P.S. and Fetter, M. (1997). A Multi-Phase
Design to Co-ordinate Surveys and
Limit Response Burden. ASA Proceedings
of the Section on Survey Research Methods,
forthcoming.

Kott, P.S. and Stukel, D.M. (1997). Can
the Jackknife Be Used With a Two-Phase
Sample? Survey Methodology, forthcoming.

Rust, Keith (1985). Variance Estimation for
Complex Estimators in Sample Surveys.
Journal of Official Statistics, 1, 381-397.




APPENDIX  A:  Justifying  the  Delete-a-Group  Jackknife 
Under  a  Single-Phase,  Stratified  Sampling  Design 

Suppose we have a probability sample design with H strata and $n_h$ sampled units within each
stratum h. Let us assume that the sample was selected without replacement but the selection
probabilities are all so small, and the joint selection probabilities are such, that using the
with-replacement variance estimator is appropriate (this rules out systematic sampling from a
purposefully-ordered list). In particular, let us assume that the estimator itself can be written
in the form:

$$t = \sum_{h=1}^{H} \sum_{j=1}^{n_h} t_{hj}.$$

Let $q_{hj} = t_{hj} - \sum_{g=1}^{n_h} t_{hg}/n_h$. The randomization variance of t is $Var(t) = \sum_{h=1}^{H} Var(t_{h+})$, where
$t_{h+} = \sum_{j} t_{hj}$. Now $Var(t_{h+})$ can be estimated in a (nearly) unbiased fashion by

$$var(t_{h+}) = (n_h/[n_h - 1]) \sum_{j=1}^{n_h} q_{hj}^2$$

("nearly" because we are ignoring finite population correction).


In  order  to  estimate  Var(t)  with  a  delete-a-group  jackknife  as  suggested  in  the  text,  we  first  order 
the  strata  in  some  fashion  and  then  order  the  units  within  each  stratum  randomly.  The  sample 
is partitioned into R (i.e., 15) systematic samples using the resulting ordered list. Let $S_r$ denote
one such systematic sample, $S_{hr}$ the set of $n_{hr}$ units in both $S_r$ and stratum h, and $S_{h(r)}$ the set of
$n_{h(r)}$ units in stratum h and not in $S_r$.

The jackknife replicate estimator $t_{(r)}$ is

$$t_{(r)} = \sum_{h=1}^{H} (n_h/n_{h(r)}) \sum_{j \in S_{h(r)}} t_{hj}.$$

Now

$$t_{(r)} - t = \sum_{h=1}^{H} \Big[ (n_h/n_{h(r)}) \sum_{j \in S_{h(r)}} t_{hj} - t_{h+} \Big].$$


Treating each $S_{h(r)}$ as a simple random subsample of the sample in stratum h, we have

$$E_2[(t_{(r)} - t)^2] = \sum_{h=1}^{H} Var_2\Big([n_h/n_{h(r)}] \sum_{j \in S_{h(r)}} t_{hj}\Big)$$

$$= \sum_{h=1}^{H} (n_h^2/n_{h(r)})[1 - (n_{h(r)}/n_h)] \sum_{j=1}^{n_h} q_{hj}^2/(n_h - 1)$$

$$= \sum_{h=1}^{H} (n_h/[n_h - 1])(n_{hr}/n_{h(r)}) \sum_{j=1}^{n_h} q_{hj}^2$$

$$= \sum_{h=1}^{H} (n_{hr}/n_{h(r)})\, var(t_{h+}),$$

where $E_2$ denotes expectation with respect to the subsampling.

Observe that for strata where $n_h < R$, $n_{hr}/n_{h(r)}$ is either zero because there are no units in both
r and h or $n_{hr}/n_{h(r)}$ is $1/(n_h - 1)$ because there is one unit in both r and h. Since the latter
situation occurs in exactly $n_h$ replicates, $\sum_{r=1}^{R} n_{hr}/n_{h(r)} = n_h/(n_h - 1)$.

For strata where $n_h \ge R$, $n_{hr}/n_{h(r)} = O(1/R)$ and $\sum_{r=1}^{R} n_{hr}/n_{h(r)} = 1 + O(1/R)$. (Technical note:
$z = O(1/R)$ means $\lim_{R \to \infty} R|z|$ is a constant). In fact, when $n_h/R$ is an integer, $n_{hr}/n_{h(r)}$ exactly
equals $1/(R - 1)$, and $\sum_{r=1}^{R} n_{hr}/n_{h(r)} = R/[R - 1]$.

Since $Var(t)$ can itself be estimated in an approximately unbiased fashion by
$var(t) = \sum_{h} (n_h/[n_h - 1]) \sum_{j} q_{hj}^2$, it is not difficult to see that the delete-a-group jackknife
variance estimator, $v_J = ([R - 1]/R) \sum_{r=1}^{R} (t_{(r)} - t)^2$, is approximately unbiased for $var(t)$ and thus for
$Var(t)$ when all strata are such that $n_h \ge R$ and is biased upward otherwise. Moreover, the
relative upward bias is bounded by $([R - 1]/R)\min_h \{1/(n_h - 1)\}$.
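The group bookkeeping in this appendix can be illustrated with a short sketch. It is not NASS production code: the function names are hypothetical, the systematic partition is taken to be "position in the ordered list modulo R", the units are assumed to be already in random order within stratum, and every stratum is assumed to have at least two units.

```python
def ratio_sum(n_h, R=15):
    """Sum over the R replicates of n_hr / n_h(r) for one stratum whose
    n_h units occupy consecutive positions of the ordered list."""
    counts = [0] * R
    for pos in range(n_h):
        counts[pos % R] += 1           # n_hr: units of stratum h in group r
    return sum(c / (n_h - c) for c in counts)


def delete_a_group_jackknife(t_hj, stratum, R=15):
    """v_J = ((R - 1) / R) * sum_r (t_(r) - t)^2 for t = sum of all t_hj.

    t_hj and stratum are parallel lists sorted by stratum.
    """
    n = {}
    for h in stratum:
        n[h] = n.get(h, 0) + 1
    t = sum(t_hj)
    v = 0.0
    for r in range(R):
        kept_sum = {h: 0.0 for h in n}
        kept_n = {h: 0 for h in n}
        for pos, (val, h) in enumerate(zip(t_hj, stratum)):
            if pos % R != r:            # unit is retained in replicate r
                kept_sum[h] += val
                kept_n[h] += 1
        # Replicate estimator: reweight each stratum by n_h / n_h(r).
        t_r = sum(n[h] / kept_n[h] * kept_sum[h] for h in n)
        v += (t_r - t) ** 2
    return (R - 1) / R * v
```

For example, `ratio_sum(4)` gives 4/3, which is $n_h/(n_h - 1)$ for a small stratum, while `ratio_sum(30)` gives 15/14, which is $R/[R - 1]$; this difference is the source of the upward bias when strata are small.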
APPENDIX  B:  Justifying  the  Delete-a-Group  Jackknife 
for  a  Restricted-Regression  Estimator  Under  a  Two-Phase  Sample 
(Including  When  the  “Second”  Phase  is  a  String  of  Phases) 

Consider the estimator, $t = \sum_{S} w_i y_i$, where the $w_i$ have the same definition as in equation (2), and
$y_i$ is a value of interest for element i. For ease of exposition, we will let $f_i$ be the inverse of the
first-phase selection probability of element i, which we assume to be large for all i. It is a simple
matter to use induction to cover the situation where $f_i$ is itself the result of several phases and
calibrations.

We  also  assume  that  neither  phase  sample  is  Poisson.  Poisson  sampling  in  the  first  phase  is 
discussed  in  Appendix  D.  Poisson  sampling  in  a  later  phase  is  identical  to  an  additional  stage 
of  sampling  because  Poisson  sampling  is  independent  from  one  selection  to  the  next. 

Let $B = \big(\sum_{U} x_i' x_i\big)^{-1} \sum_{U} x_i' y_i$, where U is the set of all elements in the population, and
$u_i = y_i - x_i B$. Equation (2) allows us to rewrite t as $t = \eta B + \sum_{S} w_i u_i$, and the variance of t as
$Var(t) \approx Var(\eta B) + Var(\sum_{S} w_i u_i) + 2Cov(\eta B, \sum_{S} w_i u_i)$. Now $\eta$ has no variance when it comes
from the frame, and $Var(t)$ collapses to $Var(\sum_{S} w_i u_i)$.

Let O and o define asymptotic orders ($z = O(m)$ means $\lim_{m \to \infty} |z|/m$ is a constant; $z = o(m)$
means $\lim_{m \to \infty} |z|/m = 0$). We assume that equation (2) holds for almost all elements in the
sample (i.e., it fails at most $o_P(m)$ times, where m is the size of S). As a result, $\sum_{S} w_i u_i \approx
\sum_{S} (f_i/p_i) u_i$ under mild conditions, which we assume to hold (this is because, treating each $f_i/p_i$ as
$O(1)$, $\big(\eta^* - \sum_{i \in S^*} [f_i/p_i] x_i\big)\big(\sum_{i \in S^*} [f_i/p_i] x_i' x_i\big)^{-1} \sum_{S^*} [f_j/p_j] x_j' u_j = O_P(\sqrt{m})\,O_P(1/m)\,O_P(\sqrt{m})$ is ignorably
small compared to $t = O_P(m)$; note that the equality $\sum_{U} x_i' u_i = 0$ has a role in making this
contention viable). Thus, $Var(\sum_{S} w_i u_i)$ is approximately the variance of a double-expansion
estimator, $\sum_{S} (f_i/p_i) u_i$. Assuming the second-phase samples within each second-phase stratum are
large, results in Kott (1990, p. 103) show the single-phase variance estimator with estimated
primary sampling unit values put in place of actual values will over-estimate the variance of a
double-expansion estimator unless the sums of the $f_i u_i$ in the second-phase strata before
subsampling are equal to zero (note that since both $m_d$ and $n_h$ in equation (B) of Kott are large,
only $e_{d2}$ matters).

Kott assumed stratified simple random sampling in both phases, but extensions to stratified
systematic probability sampling from randomly-ordered lists are straightforward. For the
first-phase sample all the $f_i$ must be large (as in the simple random sampling case), so that the
with-replacement expression for variance is appropriate. For the second-phase sample, the selection
probabilities and population must be such that the approximation $p_{ik} = (m_d - 1) p_i p_k / m_d$ holds (see
Hartley and Rao 1962), where $p_{ik}$ is the second-phase joint selection probability of two elements,
i and k, from second-phase stratum d, and $m_d$ is the number of sampled elements in that stratum.
In most NASS applications, the second phase of selection is unstratified, which is equivalent to
d being all the elements in the first-phase sample.

The second-phase variance of $\sum_{S} w_i u_i$ originating from second-phase stratum d can be expressed
by (we are assuming $m_d$ is large)

$$Var_{2d} = \sum (f_i u_i)^2 (1 - p_i)/p_i + \sum_{i \ne k} (f_i u_i)(f_k u_k)(p_{ik} - p_i p_k)/(p_i p_k)$$

$$= \sum (f_i u_i)^2 (1 - p_i)/p_i - \sum_{i \ne k} (f_i u_i)(f_k u_k)/m_d$$

$$= \sum (f_i u_i)^2 (1 - [(m_d - 1)/m_d] p_i)/p_i - \Big(\sum f_i u_i\Big)^2/m_d$$

$$\approx \sum (f_i u_i)^2 (1 - p_i)/p_i - \Big(\sum f_i u_i\Big)^2/m_d,$$

where the summations are over all elements in second-phase stratum d before the second phase
of sampling takes place. This is analogous to equation (3) in Kott.
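The step from the joint-probability form of $Var_{2d}$ to the closed form can be verified numerically. In the sketch below (hypothetical function names), `a[i]` stands for $f_i u_i$, `p[i]` for the second-phase selection probability, and the joint probabilities are taken to follow the approximation $p_{ik} = (m_d - 1) p_i p_k / m_d$ exactly, which is an assumption of the illustration.

```python
def var2d_from_joint(a, p, m_d):
    """Var_2d computed from the joint selection probabilities (sketch)."""
    n = len(a)
    own = sum(a[i] ** 2 * (1.0 - p[i]) / p[i] for i in range(n))
    cross = 0.0
    for i in range(n):
        for k in range(n):
            if i != k:
                p_ik = (m_d - 1) * p[i] * p[k] / m_d   # assumed joint probability
                cross += a[i] * a[k] * (p_ik - p[i] * p[k]) / (p[i] * p[k])
    return own + cross


def var2d_closed_form(a, p, m_d):
    """Sum a_i^2 (1 - [(m_d - 1)/m_d] p_i)/p_i minus (sum a_i)^2 / m_d."""
    first = sum(ai ** 2 * (1.0 - (m_d - 1) / m_d * pi) / pi
                for ai, pi in zip(a, p))
    return first - sum(a) ** 2 / m_d
```

The two functions return the same value for any inputs, which confirms that the only approximation in the derivation is the final replacement of $(m_d - 1)/m_d$ by 1.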

Note that $Var_{2d} \le \sum (f_i u_i)^2 (1 - p_i)/p_i$, where equality holds only when $\sum f_i u_i = 0$. This turns out
to be the reason (not directly proven here) why the delete-a-group jackknife over-estimates the
variance of t when the sums of the $f_i u_i$ within all second-phase strata are not approximately zero.

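Observe that given any column vector $\lambda$ of the same dimension as $x_i'$, $\sum_{F} \lambda' x_i' f_i u_i \approx
\sum_{U} \lambda' x_i' u_i = \sum_{U} \lambda' x_i' [y_i - x_i (\sum_{U} x_i' x_i)^{-1} \sum_{U} x_i' y_i] = 0$, where F denotes the first-phase sample. Since $\sum_{F} \lambda' x_i' f_i u_i \approx 0$ for any $\lambda$, when there
exists a $\lambda_d$ for every second-phase stratum d such that $x_i \lambda_d = \lambda_d' x_i'$ equals 1 when i is in d and
0 otherwise, then the sum of the $f_i u_i$ in any second-phase stratum before subsampling is
approximately (asymptotically) zero.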

Applying the weights defined by equation (3) to t, we get $t_{(r)} \approx \eta_{(r)} B + \sum_{S_{(r)}} [f_{j(r)}/f_j] w_j u_j \approx
\eta_{(r)} B + \sum_{S_{(r)}} [f_{j(r)}/p_j] u_j$. The first approximation makes use of the facts that the
components of $\big(\eta_{(r)} - \sum_{i \in S_{(r)}} [f_{i(r)}/f_i] w_i x_i\big)$ are $O_P(m/R)$, while the diagonal components of
$\sum_{i \in S_{(r)}} [f_{i(r)}/f_i] w_i x_i' x_i$ are $O_P(m)$ under mild conditions.

When $\eta = \eta_{(r)}$ is a frame value, $t_{(r)} - t \approx \sum_{S_{(r)}} [f_{j(r)}/p_j] u_j - \sum_{S} [f_j/p_j] u_j$. It is straightforward to
show that the delete-a-group jackknife estimates $Var(t) \approx Var(\sum_{S} w_i u_i) \approx Var(\sum_{S} [f_j/p_j] u_j)$ fairly
well with the possibility of being upwardly biased when the sum of the $f_i u_i$ before subsampling
in one or more of the second-phase strata is not equal to zero.

When $\eta = \sum_{F} f_i x_i$, then $\eta B = \sum_{F} f_i x_i B$, and t can be rewritten as $t = \sum_{F} f_i y_i +
(\sum_{S} w_i u_i - \sum_{F} f_i u_i) \approx \sum_{F} f_i y_i + (\sum_{S} [f_i/p_i] u_i - \sum_{F} f_i u_i) = \sum_{F} f_i (y_i + \{[I_i/p_i] - 1\} u_i)$, where $I_i = 1$
when i is in S and zero otherwise. In a similar fashion, $t_{(r)} \approx \sum_{F} f_{i(r)} (y_i + \{[I_i/p_i] - 1\} u_i)$. It is
straightforward to show that the delete-a-group jackknife estimates the conventional multi-stage
variance estimator ignoring fpc at the first stage, which in turn estimates $Var(t)$ fairly well but
has the possibility of being upwardly biased when the sum of the $f_i u_i$ before subsampling in one
or more of the second-phase strata is not equal to zero (see Kott and Stukel [1997] for some
missing details).

Extension of the above result to a sample design where the "second-phase" sample is itself the
result of a string of phases, all within the same second-phase strata, is a simple matter. We need
only assume that $p_{ik}^* = \alpha p_i^* p_k^*$, where $p_i^*$ ($p_{ik}^*$) denotes the appropriate product of conditional
(joint) selection probabilities, and $\alpha = 1 - O(1/m_d)$. Appendix C has more on the
sequence-of-sampling-phases methodology used in the ARMS design.

APPENDIX  C:  Justifying  the  Delete-a-Group  Jackknife 
for  Certain  Composite  Estimators  in  the  ARMS 

We  restrict  attention  in  this  appendix  to  the  unusual  composite  estimators  used  with  the  ARMS 
surveys. 

To show that the delete-a-group jackknife works for a composite estimator like the Phase III
Corn-for-Grain/Beef CRR described in the text, one needs to show that it works when estimating
a total for:

1)  the  intersection  of  the  two  original  target  populations  (list  farms  with  grain  corn  and  at  least 
10  weaned  calves), 

2)  each  of  the  two  “rump”  populations  that  contain  elements  in  one  population  but  not  the  other, 
and, 

3)  the  union  of  the  two  rumps  and  the  intersection. 

The delete-a-group jackknife works for estimates of a rump total because it works for domains
(by defining item values within one target population as zero for farms outside the domain).
Assuming it works for estimated totals in the intersection, it also works for estimated totals
in the union, because it works for functions of estimators (like the sum of the totals in the
two rumps and the intersection). We discuss intersections below.
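
The domain device described above can be sketched in a few lines. This is only an illustration under stated assumptions: the farms, weights, and item values are made up, a single set of weights is used for simplicity, and the helper name `domain_total` is hypothetical rather than anything used in NASS systems.

```python
# Hypothetical illustration: a domain total is an ordinary weighted total
# after item values are set to zero for units outside the domain.
def domain_total(weights, values, in_domain):
    """Weighted total of `values`, treating units outside the domain as zero."""
    return sum(w * (y if d else 0.0) for w, y, d in zip(weights, values, in_domain))

# Made-up farms: (weight, item value, in target population A?, in target population B?)
farms = [
    (10.0, 5.0, True, True),
    (20.0, 2.0, True, False),
    (15.0, 4.0, False, True),
    (5.0, 8.0, True, True),
]
w = [f[0] for f in farms]
y = [f[1] for f in farms]
in_a = [f[2] for f in farms]
in_b = [f[3] for f in farms]

intersection = domain_total(w, y, [a and b for a, b in zip(in_a, in_b)])
rump_a = domain_total(w, y, [a and not b for a, b in zip(in_a, in_b)])
rump_b = domain_total(w, y, [b and not a for a, b in zip(in_a, in_b)])
union = domain_total(w, y, [a or b for a, b in zip(in_a, in_b)])
```

In this toy example the union total equals the sum of the two rump totals and the intersection total, which is the function-of-estimators relationship the argument relies on.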

Let us call the two samples we are compositing A and B. In principle, we can estimate an item
total in the intersection of the target populations using either sample. Let $t_C = \sum_C w_{iC}\,y_i$ be the
estimated total calculated using sample C (= A or B), and let $t = \lambda t_A + (1 - \lambda) t_B$ be the
composite total. Note that $y_i$ is defined to be zero for farms outside the intersection.
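
As a minimal sketch of the composite total just defined, the following computes $t_A$, $t_B$, and $t$ for made-up data. The function names, the weights, and the choice $\lambda = 0.25$ are assumptions for illustration only.

```python
def estimated_total(weights, values, in_intersection):
    """t_C = sum over sample C of w_iC * y_i, with y_i taken as zero
    for farms outside the intersection of the two target populations."""
    return sum(w * y for w, y, keep in zip(weights, values, in_intersection) if keep)

def composite_total(t_a, t_b, lam):
    """t = lam * t_A + (1 - lam) * t_B."""
    return lam * t_a + (1.0 - lam) * t_b

# Made-up samples A and B (weights, item values, intersection flags).
t_a = estimated_total([10.0, 20.0, 30.0], [1.0, 2.0, 3.0], [True, False, True])
t_b = estimated_total([25.0, 25.0], [2.0, 3.0], [True, True])
t = composite_total(t_a, t_b, lam=0.25)
```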

The ARMS samples are drawn sequentially to avoid overlap using an unstratified variant of
systematic unequal probability sampling at every phase after the initial screening phase. Let $\pi_{it}$
be the probability of selecting farm $i$ for sample $t$ given that it is available for sampling after
sample $t-1$ is drawn. Let $t = 1$ denote the first sample drawn after the screening sample. Finally,
let $p_{it} = (1 - \pi_{i1}) \cdots (1 - \pi_{i,t-1})\,\pi_{it}$. Note that $\pi_{is} = 0$ when farm $i$ is not in the target population for
sample $s$. Without loss of generality, we will assume sample A was selected before B, and that
A, B, and their intersection are of size $O(m)$.
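
The product defining $p_{it}$ can be sketched as follows. The function name and the example probabilities are hypothetical; the list holds the conditional probabilities $\pi_{i1}, \ldots, \pi_{it}$ for one farm.

```python
def unconditional_selection_probability(conditional_probs):
    """p_it = (1 - pi_i1) * ... * (1 - pi_i,t-1) * pi_it, where the input
    lists the conditional probabilities pi_i1, ..., pi_it for one farm."""
    *earlier, last = conditional_probs
    p = last
    for pi in earlier:
        p *= 1.0 - pi  # farm must survive (not be selected for) each earlier sample
    return p

# Farm eligible for samples 1, 2, and 3 (made-up probabilities).
p_i3 = unconditional_selection_probability([0.1, 0.2, 0.5])

# Farm outside the target population for sample 1, so pi_i1 = 0.
p_i2 = unconditional_selection_probability([0.0, 0.3])
```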

Using arguments similar to those in the previous appendix, we can see that

$$t_C \approx \sum_F f_i y_i + \Big(\sum_C [f_i/p_{iC}]\,u_{iC} - \sum_F f_i u_{iC}\Big),$$

where $u_{iC} = y_i - (\sum y_k / \sum x_{kC})\,x_{iC}$, the summations being over the farms in the population that are
in the same calibration group as $i$ when computing $t_C$, and $x_{kC}$ is the x-value of farm $k$ when
computing $t_C$. Observe that the first-phase sample applies to both A and B since there is one
ARMS screening sample for all purposes. By contrast, x- and u-values as well as calibration group
memberships may differ across samples for the same $i$. This happens when the 1996 Phase II
Corn-for-Grain PPCR is composited with the Phase II Corn PPR; $x_{iA}$ is corn-for-grain acres for
farm $i$, while $x_{iB}$ is general corn acres.

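We can now see that

$$\mathrm{Var}(t) = \mathrm{Var}_1\Big(\sum_F f_i y_i\Big) + E_1\Big\{\mathrm{Var}_2\Big(\sum_A [f_i/p_{iA}]\,u_{iA}\Big) + \mathrm{Var}_2\Big(\sum_B [f_i/p_{iB}]\,u_{iB}\Big)
 + 2\,\mathrm{Cov}_2\Big(\sum_A [f_i/p_{iA}]\,u_{iA},\ \sum_B [f_i/p_{iB}]\,u_{iB}\Big)\Big\}.$$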

Now

$$\mathrm{Cov}_2\Big(\sum_A [f_i/p_{iA}]\,u_{iA},\ \sum_B [f_i/p_{iB}]\,u_{iB}\Big) = \sum_{i \in F(A)} \sum_{k \in F(B)} (f_i u_{iA})(f_k u_{kB})\,\{[p_{iAkB}/(p_{iA}\,p_{kB})] - 1\},$$

where F(C) denotes that part of the first-phase sample in the target population for sample C, and

$$p_{iAkB} = \pi_{i*k*1} \cdots \pi_{i*k*,A-1}\,\pi_{ik*A}\,(1 - \pi_{k,A+1}) \cdots (1 - \pi_{k,B-1})\,\pi_{kB},$$

where $\pi_{i*k*t}$ is the probability of selecting neither farm $i$ nor $k$ for sample $t$ provided both are
available after sample $t-1$, and $\pi_{ik*A}$ is the probability of selecting farm $i$ but not $k$ for
sample A given that both are available after sample A-1.
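
To make the joint probability concrete, the sketch below multiplies out the factors of $p_{iAkB}$ for a deliberately simple toy design, which is an assumption and not the ARMS design: sample A is a single farm drawn with equal probability from $N$ farms, and sample B is a single farm drawn with equal probability from the remaining $N - 1$. The function and argument names are hypothetical. Exact fractions are used so that the relative departure from independence, $p_{iAkB}/(p_{iA}\,p_{kB}) - 1$, can be read off without rounding.

```python
from fractions import Fraction
from math import prod

def joint_probability(neither_before_a, i_not_k_at_a, k_between, k_at_b):
    """p_iAkB: neither farm selected before A, i (but not k) selected for A,
    k not selected for any sample strictly between A and B, k selected for B."""
    p = prod(neither_before_a, start=Fraction(1))
    p *= i_not_k_at_a
    p *= prod((1 - q for q in k_between), start=Fraction(1))
    return p * k_at_b

# Toy design (an assumption): A is the first sample, B the second,
# each a single equal-probability draw without replacement.
N = 10
p_joint = joint_probability([], Fraction(1, N), [], Fraction(1, N - 1))
p_ia = Fraction(1, N)                                # i selected for A
p_kb = (1 - Fraction(1, N)) * Fraction(1, N - 1)     # k survives A, then selected for B
a_k = p_joint / (p_ia * p_kb) - 1
```

In this toy design the departure from independence is $1/(N - 1)$, which shrinks at the rate of one over the number of available units.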
We will assume that the design is such that, given $k \ne i$, $p_{iAkB} = (1 + a_k)\,p_{iA}\,p_{kB}$, where
$a_k = O(1/m)$ (if $\pi_{i*k*s}/(\pi_{i*s}\,\pi_{k*s}) \approx 1$ for all $s < A$, then the assumption is equivalent to
$\pi_{ik*A} = \pi_{k*A}\,\mathrm{Prob}(i \text{ chosen for A} \mid k \text{ not chosen for A}) = \pi_{k*A}\,\pi_{iA}(1 + a_k)$). Since
$\sum_{F(B)} f_i u_{iB} \approx 0$ and $p_{iAiB} = 0$ (the samples do not overlap), summing the last expression for $\mathrm{Cov}_2$ over $i$
yields a term of order $1/m$. Summing then over $k$ yields a term of order 1. Since $\mathrm{Var}(t_A)$ is
$O(m)$, the covariance term is asymptotically ignorable.

It is now not hard to show, using arguments developed here and in the previous appendix, that
the delete-a-group jackknife is unbiased for the variance of $t$ under conditions we assume to hold.

APPENDIX D: Justifying the Delete-a-Group Jackknife
with Finite Population Correction for a Single-Phase Poisson Sample

Suppose we have a calibrated estimator for a total, $t = \sum_S w_j y_j$, where the $w_j$ satisfy equation (6):

$$w_j = 1/\pi_j + \Big(T^* - \sum_{i \in S^*} [1/\pi_i]\,x_i\Big)\Big(\sum_{i \in S^*} [1/\pi_i]\,x_i' x_i\Big)^{-1} [1/\pi_j]\,x_j' \qquad (6)$$

for $j \in S^*$, and a predetermined value otherwise. In addition, there is a vector $\lambda$ such that
$x_j \lambda = \sqrt{1 - \pi_j}$ for all $j$.

Let $B = (\sum_U x_i' x_i)^{-1} \sum_U x_i' y_i$, where U is the set of all elements in the population, and
$u_i = y_i - x_i B$. Now $t = \sum_S w_j y_j = \sum_S w_j (x_j B + u_j) = TB + \sum_S w_j u_j$. Consequently, $\mathrm{Var}(t) =
\mathrm{Var}(\sum_S w_j u_j) \approx \mathrm{Var}(\sum_S u_j/\pi_j)$ under mild conditions we assume to hold.

If the sample is Poisson, we have $\mathrm{Var}(t) \approx \sum_U u_j^2 (1 - \pi_j)/\pi_j$, which, in principle, can be estimated
in a nearly unbiased fashion by $\mathrm{var}(t) = \sum_S u_j^2 (1 - \pi_j)/\pi_j^2 \approx \sum_S w_j^2 (1 - 1/w_j)\,u_j^2 = \sum_S (w_j^{(v)})^2 u_j^2$,
where $w_j^{(v)} = w_j \sqrt{1 - 1/w_j}$ (see equation (7)).
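
The finite-population-corrected weights and the resulting variance expression can be sketched as below. The function names are hypothetical, the weights and residuals are made up, and the residuals $u_j$ are taken as given here although in practice they would have to be estimated.

```python
from math import sqrt

def fpc_weight(w):
    """w^(v) = w * sqrt(1 - 1/w); requires w >= 1."""
    return w * sqrt(1.0 - 1.0 / w)

def poisson_variance_estimate(weights, residuals):
    """Sum over the sample of (w_j^(v))^2 * u_j^2."""
    return sum(fpc_weight(w) ** 2 * u ** 2 for w, u in zip(weights, residuals))

# Made-up calibrated weights and residuals. A weight of 1 (a certainty
# unit) contributes nothing, as the finite population correction implies.
weights = [1.0, 2.0, 4.0]
residuals = [3.0, -1.0, 0.5]
v = poisson_variance_estimate(weights, residuals)
```

Since $(w_j^{(v)})^2 = w_j(w_j - 1)$, the three contributions here are 0, 2, and 3.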

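Using the definition of $w_{j(r)}^{(v)}$ in equation (8), we have $t_{(r)}^{(v)} = \sum_S w_{j(r)}^{(v)} y_j \approx \sum_S w_j^{(v)} x_j B +
\sum_{S_{(r)}} w_{j(r)}^{(v)} u_j$. Consequently, $t_{(r)}^{(v)} - t^{(v)} \approx \sum_{S_{(r)}} w_{j(r)}^{(v)} u_j - \sum_S w_j^{(v)} u_j$. Now the $n_{(r)}$ members of $S_{(r)}$ can be
viewed as a simple random subsample of the $n$ members of $S$. Since $n/n_{(r)} = 1 + O_P(1/R)$,
$t_{(r)}^{(v)} - t^{(v)} \approx (n/n_{(r)}) \sum_{S_{(r)}} w_i^{(v)} u_i - \sum_S w_i^{(v)} u_i$. Using arguments similar to ones made in Appendix A,
we have $E_o[(t_{(r)}^{(v)} - t^{(v)})^2] \approx (n/n_{(r)})(1 - n_{(r)}/n)\,[\sum_S (w_i^{(v)})^2 u_i^2 - (\sum_S w_i^{(v)} u_i)^2/n]$. Now
$\sum_S w_i^{(v)} u_i/\sqrt{n} \approx \sum_U (1 - 1/w_i)^{1/2} u_i/\sqrt{n} \approx \sum_U (1 - \pi_i)^{1/2} u_i/\sqrt{n} = \sum_U \lambda' x_i' u_i/\sqrt{n} = 0$, since $\lambda' x_i' =
x_i \lambda = \sqrt{1 - \pi_i}$ for some $\lambda$, while $\sum_U x_i' u_i = 0$.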
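
The orthogonality step at the end of this argument can be checked numerically. The sketch below is an illustration under stated assumptions: a made-up population of five units, a two-component $x_i = (1, \sqrt{1 - \pi_i})$ so that $\lambda = (0, 1)'$, and a hand-coded solution of the 2x2 normal equations. None of the names come from the report.

```python
from math import sqrt

def fit_two_column(x1, x2, y):
    """Population least-squares B for rows x_i = (x1_i, x2_i),
    obtained by solving the 2x2 normal equations directly."""
    s11 = sum(a * a for a in x1)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s22 = sum(b * b for b in x2)
    r1 = sum(a * v for a, v in zip(x1, y))
    r2 = sum(b * v for b, v in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return (s22 * r1 - s12 * r2) / det, (s11 * r2 - s12 * r1) / det

# Made-up selection probabilities and item values for the whole population U.
pi = [0.1, 0.25, 0.5, 0.75, 0.9]
y = [12.0, 7.0, 9.0, 3.0, 5.0]
x1 = [1.0] * len(pi)
x2 = [sqrt(1.0 - p) for p in pi]   # sqrt(1 - pi_i) included as a component of x_i

b1, b2 = fit_two_column(x1, x2, y)
u = [v - (a * b1 + b * b2) for a, b, v in zip(x1, x2, y)]

# Because sqrt(1 - pi_i) is a component of x_i, the residuals are
# orthogonal to it over the population, up to rounding error.
check = sum(sqrt(1.0 - p) * r for p, r in zip(pi, u))
```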

Continuing, $E_o[(t_{(r)}^{(v)} - t^{(v)})^2] \approx (n/n_{(r)})(1 - n_{(r)}/n) \sum_S (w_j^{(v)})^2 u_j^2 = (n_r/n_{(r)}) \sum_S (w_j^{(v)})^2 u_j^2$, where $n_r$
is the size of $S_r$. From here it is easy to see that $v_J^{(fpc)}$ is nearly unbiased for $\sum_S (w_j^{(v)})^2 u_j^2$, which
in turn is nearly unbiased for $\mathrm{Var}(t)$.
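
The simple-random-subsample result used above can be verified by brute force on a small example. In the sketch below, which uses made-up values and hypothetical function names, `z` plays the role of the products $w_i^{(v)} u_i$. The exact without-replacement formula carries an extra factor $n/(n-1)$, which is close to 1 for large $n$ and is dropped in the approximation in the text.

```python
from itertools import combinations

def mean_squared_deviation(z, n_keep):
    """Average of ((n / n_keep) * subsample sum - full sum)^2 over all
    equally likely subsamples of size n_keep."""
    n = len(z)
    total = sum(z)
    devs = [((n / n_keep) * sum(sub) - total) ** 2 for sub in combinations(z, n_keep)]
    return sum(devs) / len(devs)

def srs_formula(z, n_keep):
    """(n/n_keep) * (1 - n_keep/n) * [n/(n-1)] * [sum z^2 - (sum z)^2 / n]."""
    n = len(z)
    ss = sum(v * v for v in z) - sum(z) ** 2 / n
    return (n / n_keep) * (1 - n_keep / n) * (n / (n - 1)) * ss

z = [1.5, -2.0, 0.5, 3.0, -1.0, -2.0]   # made-up values of w^(v) * u
n, n_keep = len(z), 4
exact = mean_squared_deviation(z, n_keep)
formula = srs_formula(z, n_keep)

# The leading factor simplifies to (deleted-group size) / (retained size).
factor = (n / n_keep) * (1 - n_keep / n)
```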

Observe that when $\sum_U (1 - \pi_i)^{1/2} u_i \ne 0$, the jackknife is biased downward. In practice, this may
be of little importance because if we felt that $\sum_U (1 - \pi_i)^{1/2} u_i/\sqrt{n}$ had an absolute value far from
zero, we would include $\sqrt{1 - \pi_i}$ as a component of $x_i$.